Abstract

We consider the problem of estimating the unknown response function in the multichannel deconvolution model with a boxcar-like kernel, which is of particular interest in signal processing. It is known that, when the number of channels is finite, the precision of reconstruction of the response function increases as the number of channels $M$ grows (even when the total number of observations $n$ for all channels $M$ remains constant) and this requires that the parameters of the channels form a Badly Approximable $M$-tuple.

Recent advances in data collection and recording techniques have made it of urgent interest to study the case when the number of channels $M = M_n$ grows with the total number of observations $n$. However, in real-life situations, the number of channels $M = M_n$ usually refers to the number of physical devices and, consequently, may grow to infinity only at a slow rate as $n \to \infty$. Unfortunately, existing theoretical results cannot be blindly applied to accommodate the case when $M = M_n \to \infty$ as $n \to \infty$. This is due to the fact that, to the best of our knowledge, so far no one has studied the construction of a Badly Approximable $M$-tuple of a growing length on a specified interval, of a non-asymptotic length, of the real line, as $M$ is growing. Therefore, this generalization requires non-trivial results in number theory.

When $M = M_n$ grows slowly as $n$ increases, we develop a procedure for the construction of a Badly Approximable $M$-tuple on a specified interval, of a non-asymptotic length, together with a lower bound associated with this $M$-tuple, which explicitly shows its dependence on $M$ as $M$ is growing. This result is further used for the evaluation of the $L^2$-risk of the suggested adaptive wavelet thresholding estimator of the unknown response function and, furthermore, for the choice of the optimal number of channels $M$ which minimizes the $L^2$-risk.
1 Introduction



We consider the estimation problem of the unknown response function $f(\cdot) \in L^2(T)$ from observations $y(u_l, t_i)$, $l = 1, 2, \ldots, M$, $i = 1, 2, \ldots, N$, where

$$y(u_l, t_i) = \int_T g(u_l, t_i - x) f(x)\,dx + \varepsilon_{li}, \quad u_l \in U, \quad t_i = (i-1)/N, \qquad (1.1)$$

where $U = [a, b]$, $0 < a \le b < \infty$, $T = [0, 1]$ and $\varepsilon_{li}$ are standard Gaussian random variables, independent for different $l$ and $i$. We shall be interested in the case when the blurring (or kernel) function $g(\cdot, \cdot)$ is the so-called boxcar-like kernel, i.e.,

$$g(u, t) = \frac{\gamma(u)}{2}\, I(|t| < u),$$

where $\gamma(\cdot)$ is some positive function such that

$$\gamma_1 \le \gamma(u) \le \gamma_2, \quad u \in U, \qquad (1.2)$$

for some $0 < \gamma_1 \le \gamma_2 < \infty$. (Obviously, this is true if $\gamma(\cdot)$ is a continuous function.) Hence, (1.1) is of the form

$$y(u_l, t_i) = \frac{\gamma(u_l)}{2} \int_0^1 I(|t_i - x| < u_l) f(x)\,dx + \varepsilon_{li}, \quad u_l \in U, \quad t_i = (i-1)/N, \qquad (1.3)$$

for $l = 1, 2, \ldots, M$ and $i = 1, 2, \ldots, N$.
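To make the sampling scheme in (1.3) concrete, the following sketch simulates observations from the model. It is only an illustration under stated assumptions: the test response function passed as `f`, the midpoint Riemann sum used for the integral, the grid size and all function names are hypothetical choices that do not come from the paper, and the boxcar choice $\gamma(u) = 1/u$ is used as the default.

```python
import random

def boxcar_observation(f, u, t, gamma=lambda u: 1.0 / u, noise_sd=1.0,
                       grid=1000, rng=None):
    """One observation y(u, t) of model (1.3); integral by a midpoint rule.

    f        : response function on T = [0, 1] (hypothetical test function)
    u, t     : channel parameter u_l and sampling point t_i
    gamma    : positive function gamma(.); boxcar kernel 1/u by default
    noise_sd : noise standard deviation (1 in the paper, 0 for a noiseless check)
    """
    if rng is None:
        rng = random.Random(0)
    h = 1.0 / grid
    integral = 0.0
    for k in range(grid):
        x = (k + 0.5) * h
        if abs(t - x) < u:              # indicator I(|t - x| < u)
            integral += f(x) * h
    return 0.5 * gamma(u) * integral + noise_sd * rng.gauss(0.0, 1.0)

def sample_channels(f, us, N, noise_sd=1.0, seed=0):
    """Observations y(u_l, t_i), t_i = (i - 1) / N, one row per channel."""
    rng = random.Random(seed)
    return [[boxcar_observation(f, u, (i - 1) / N, noise_sd=noise_sd, rng=rng)
             for i in range(1, N + 1)] for u in us]
```

For instance, `sample_channels(f, [0.1, 0.2], N=4)` returns two rows of four observations, so that $n = MN = 8$.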

In signal processing, this model is referred to as a multichannel deconvolution model, where $M$ is the number of channels and $N$ is the number of observations per channel, so that $n = MN$ is the total number of observations. We assume that the measurements $t_i$ in each channel are equispaced but the observer can choose the number of channels $M$ and the points $u_l$, $l = 1, \ldots, M$, in (1.1) prior to the experiment as a part of experimental design. In order to be able to assess convergence rates depending on the number of channels $M$ and the choice of points $u_l$, $l = 1, 2, \ldots, M$, we shall further assume that the total number of observations $n$ is fixed and very large (i.e., $n \to \infty$). The objective is to choose $M$ and $u_l$, $l = 1, 2, \ldots, M$, which ensure the construction of an estimator of the response function $f$ with the highest possible convergence rates in terms of $n$.

Note that standard deconvolution (i.e., when $a = b$) with the boxcar kernel (i.e., when $\gamma(u) = 1/u$, for some fixed $u > 0$) is a common model in many areas of signal and image processing which include, for instance, LIDAR remote sensing and reconstruction of blurred images. LIDAR is a laser device which emits pulses, reflections of which are gathered by a telescope aligned with the laser, see, e.g., Park et al. (1997) and Harsdorf & Reuter (2000). The return signal is used to determine the distance and the position of the reflecting material. However, if the system response function of the LIDAR is longer than the time resolution interval, then the measured LIDAR signal is blurred and the effective accuracy of the LIDAR decreases. This loss of precision can be corrected by deconvolution. In practice, measured LIDAR signals are corrupted by additional noise which renders direct deconvolution impossible. If $M \ge 2$, then we talk about a multichannel deconvolution model with blurring functions $g_l(t) = g(u_l, t)$.

Although standard deconvolution models are traditionally solved using the Fourier transform or the Fourier series, if the corresponding blurring function $g(\cdot)$ is a boxcar-like kernel, implementation of the standard Fourier series based technique is impossible. This happens when the Fourier transform of $g(\cdot)$ has real zeros, e.g., when $g(\cdot)$ is the boxcar kernel $g(x) = (2u)^{-1} I(|x| < u)$, for some fixed $u > 0$. When $M = 1$, Johnstone et al. (2004) and Johnstone & Raimondo (2004) managed to circumvent this obstacle by considering a boxcar kernel $g(\cdot)$ with irrational scale. Their method is based on the fact that the Fourier coefficients of the boxcar kernel do not vanish at frequencies $(\pi k u)$ when $u$ is a Badly Approximable (BA) number. An irrational number $u$ is BA if the terms $a_n = a_n(u)$ of its continued fraction expansion $[a_0; a_1, a_2, \ldots]$, where $a_0$ is an integer and $a_1, a_2, \ldots$ is an infinite sequence of positive integers, are bounded, i.e., $\sup_n a_n(u) < \infty$. This notion is related to the fact that a BA number cannot be approximated well by a rational number, which leads to the fact that $f$ can be recovered reasonably well. Since standard deconvolution is a particular example of linear statistical ill-posed inverse problems in the sense of Hadamard, i.e., the inversion does not depend continuously on the observed data, Johnstone & Raimondo (2004) used number theory to prove that the degree of ill-posedness in boxcar deconvolution is $\nu = 3/2$. Roughly speaking, the degree of ill-posedness specifies how much the error in the right-hand side
of the equation is amplified in the solution. For example, if $f$ belongs to a space with a smoothness index $s > 0$ and the degree of ill-posedness is $\nu > 0$, then the quadratic risk of the best possible estimator of the response function $f$ is of the order $O\left(n^{-\frac{2s}{2s+2\nu+1}}\right)$.

De Canditiis & Pensky (2004, 2006), following mathematical ideas of Casey & Walnut (1994), extended the results of Johnstone et al. (2004) and Johnstone & Raimondo (2004), and showed that if $M$ is finite, $M \ge 2$, one of the $u_l$'s is a BA number, and $u_1, u_2, \ldots, u_M$ is a BA $M$-tuple, then the degree of ill-posedness is $\nu = 1 + 1/(2M)$. The notion of a BA $M$-tuple refers to a collection of $M$ irrational numbers which are difficult to approximate simultaneously by fractions with the same denominator. It will be discussed in depth in Section 2. Therefore, in the case of $M$ channels, the estimation problem requires a construction of a BA $M$-tuple, which has been accomplished by the number theory community (it is described in, e.g., Schmidt (1969, 1980)).

Recent advances in data collection and recording techniques have made it of urgent interest to study the case when the number of channels $M = M_n$ grows with the total number of observations $n$. It turns out that when the number of channels $M = M_n$ grows fast as the total number of observations $n$ increases, one does not need to make a special choice of the points $u_l$, $l = 1, 2, \ldots, M$, and it is sufficient to take them to be equidistant. Indeed, Pensky & Sapatinas (2010) considered the discrete multichannel deconvolution model (1.1) as observations on the continuous functional deconvolution model

$$y(u, t) = f * g(u, t) + \frac{1}{\sqrt{n}}\, z(u, t), \quad u \in U, \quad t \in T, \qquad (1.4)$$

where $z(u, t)$ is assumed to be a two-dimensional Gaussian white noise, i.e., a generalized two-dimensional Gaussian field with covariance function $E[z(u_1, t_1) z(u_2, t_2)] = \delta(u_1 - u_2)\delta(t_1 - t_2)$, where $\delta(\cdot)$ denotes the Dirac $\delta$-function, and $f * g(u, t) = \int_T g(u, t - x) f(x)\,dx$ with the blurring (or kernel) function $g(\cdot, \cdot)$ assumed to be known. If $a = b$, the functional deconvolution model (1.4) reduces to the standard deconvolution model which attracted attention of a number of researchers, e.g., Donoho (1995), Abramovich & Silverman (1998), Kalifa & Mallat (2003), Johnstone et al. (2004), Donoho & Raimondo (2004), Johnstone & Raimondo (2004), Neelamani et al. (2004), Kerkyacharian et al. (2007), Cavalier & Raimondo (2007) and Chesneau (2008), among others.

Formulation of the functional deconvolution model (1.4) allowed Pensky & Sapatinas (2010) to study the interplay between discrete and continuous deconvolution models. The ideal continuous deconvolution model (1.4) assumes that one can measure $y(u, t)$ at any $u \in U$ and $t \in T$ and that $n^{-1/2}$ marks the precision of these observations. Nevertheless, this does not happen in real-life situations where one observes $y(u, t)$ only at the points $u_l \in U$, $t_i = (i - 1)/N$ for $l = 1, 2, \ldots, M$ and $i = 1, 2, \ldots, N$. Pensky & Sapatinas (2010) showed that the degree of ill-posedness in the continuous deconvolution model (1.4) is $\nu = 1$ and that it can be attained in the discrete deconvolution model (1.1) if $M = M_n \ge c_0 n^{1/3}$ for some constant $c_0 > 0$, independent of $n$. Indeed, in this case, one does not need to employ BA numbers or BA $M$-tuples: it is sufficient to observe the discrete deconvolution model (1.1) at equidistant points $u_l = a + l(b - a)/M$, $l = 1, 2, \ldots, M$. This setup provides the "best possible" minimax convergence rates (under the $L^2$-risk and over a wide range of Besov balls) in the model.

However, in real-life situations, the number of channels $M$ usually refers to the number of physical devices and, consequently, cannot be very big. Therefore, $M = M_n \ge c_0 n^{1/3}$ may be impossible although it is natural to assume that $M = M_n$ may grow to infinity at a slower rate as $n \to \infty$. Unfortunately, the theoretical results obtained by De Canditiis & Pensky (2006) cannot be blindly applied to accommodate the case when $M = M_n \to \infty$ as $n \to \infty$. This is due to the fact that, to the best of our knowledge, so far no one has studied the construction of a BA $M$-tuple of a growing length on a specified interval, of a non-asymptotic length, of the real line. Therefore, this generalization requires non-trivial results in number theory.

Our aim is to investigate the situation when $M = M_n$ grows slowly with $n$ and to derive the necessary new results in number theory in order to devise a technique which allows one to approach minimax convergence rates (under the $L^2$-risk and over a wide range of Besov balls) in the continuous model (1.4) with a factor which grows slower than any power of $n$. This situation seems to be of particular interest nowadays since data recording equipment is getting cheaper and cheaper while overall volumes of data are growing very fast.

When $M = M_n$ grows slowly as $n$ increases, we develop a procedure for the construction of a BA $M$-tuple on a specified interval, of a non-asymptotic length, together with a lower bound associated with this $M$-tuple, which explicitly shows its dependence on $M$ as $M$ is growing. This result is further used for evaluation of the $L^2$-risk of the suggested adaptive wavelet thresholding estimator of the unknown response function and, furthermore, for the choice of the optimal number of channels $M$ which minimizes the $L^2$-risk.

The theoretical results that we have obtained lie at the intersection of number theory, statistics and signal processing. We hope to alert the number theory community to a new problem of constructing a BA $M$-tuple on a specified interval, of a non-asymptotic length, of the real line, as $M$ is growing. On the other hand, we believe that our findings will also be of interest to researchers in statistics and signal processing.

The rest of the paper is organized as follows. Section 2 provides some number theory background which is required for understanding the material presented in subsequent sections. Section 3 briefly reviews the adaptive wavelet thresholding estimator introduced in Pensky & Sapatinas (2009). Section 4 explains the relationship between the $L^2$-risk of the estimator obtained in Section 3 and the theory of Diophantine approximation, thus motivating the derivation of the new results in number theory obtained in Section 5. In particular, the objective of Section 5 is the construction of a BA $M$-tuple on a specified interval when $M = M_n \to \infty$ as $n \to \infty$ and the development of related asymptotic bounds which are necessary in order to choose an optimal value of $M = M_n$ in this case. Section 6 provides the asymptotic upper bounds for the $L^2$-risk of the adaptive wavelet thresholding estimator constructed in Section 3 when $M = M_n$ is a slowly growing function of $n$. We conclude in Section 7 with a brief discussion while Section 8 contains the proofs of the theoretical results obtained in earlier sections.

2 Background Results in Number Theory 

The theory of Diophantine approximation is an important branch of number theory (see, e.g., Edixhoven & Evertse (1993), Lang (1966), Masser et al. (2003) and Schmidt (1980, 1991)). One important topic of the above theory is the simultaneous approximation of linear forms, which was pursued as early as the mid-19th century by Dirichlet and later studied by a number of prominent researchers in the field. In particular, it is known that for any real numbers $\beta_1, \beta_2, \ldots, \beta_M$ there exist integer numbers $q$ and $p_1, p_2, \ldots, p_M$ such that

$$\max_{i=1,2,\ldots,M} |\beta_i q - p_i| < \frac{M}{M+1}\, q^{-1/M}. \qquad (2.1)$$



The above result was proved by Minkowski and has been expanded in recent years to cover systems of linear forms (see, e.g., Schmidt (1980), p. 36, pp. 40-41). We note that in the case where $M = 1$, the constant $C(M) = M/(M + 1)$ in (2.1) reduces to $1/2$ whereas, by Hurwitz's theorem, the best possible value is $1/\sqrt{5}$ (see, e.g., Schmidt (1980), Theorem 2F, p. 6). For $M = 2$, $C(M)$ takes the value $2/3$; the best possible value is unknown although if $C_0(2)$ denotes the infimum of admissible values of $C(M)$ for $M = 2$, then it is known that $\sqrt{2/7} < C_0(2) < 0.615$ (see, e.g., Schmidt (1980), p. 41). Furthermore, the corresponding best constant in the case of systems of linear forms is positive, meaning that it cannot be replaced by arbitrarily small constants (see, e.g., Schmidt (1980), Section 4, pp. 41-47).
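Inequality (2.1) can be explored numerically. The sketch below is an illustration rather than part of the paper: it searches for denominators $q$ that satisfy (2.1) for a given tuple, using the fact that the best choice of $p_i$ turns $|\beta_i q - p_i|$ into the distance from $q\beta_i$ to the nearest integer. The function names, the test tuple $(\sqrt{2}, \sqrt{3})$ and the search range are arbitrary assumptions, and floating-point arithmetic limits the search to moderate values of $q$.

```python
import math

def dist_to_int(x):
    """Distance from x to the nearest integer."""
    return abs(x - round(x))

def minkowski_denominators(betas, q_max):
    """All q in 1..q_max with max_i ||q * beta_i|| < M/(M+1) * q**(-1/M)."""
    M = len(betas)
    const = M / (M + 1)
    hits = []
    for q in range(1, q_max + 1):
        worst = max(dist_to_int(q * b) for b in betas)
        if worst < const * q ** (-1.0 / M):
            hits.append(q)
    return hits

# Hypothetical test tuple, M = 2.
example_hits = minkowski_denominators((math.sqrt(2), math.sqrt(3)), 200)
```

A BA $M$-tuple is one for which the same search fails for every $q$ once the constant $M/(M+1)$ is replaced by a sufficiently small positive constant.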

We, however, are interested in the opposite result. Namely, the real numbers $\beta_1, \beta_2, \ldots, \beta_M$ form a BA $M$-tuple if for any integer numbers $q > 0$ and $p_1, p_2, \ldots, p_M$ one has

$$\max_{i=1,2,\ldots,M} |\beta_i q - p_i| > B(M)\, q^{-1/M}, \qquad (2.2)$$

for some constant $B(M) > 0$, dependent on $M$ (and $\beta_1, \beta_2, \ldots, \beta_M$) but independent of $q$ and $p_1, p_2, \ldots, p_M$ (see, e.g., Schmidt (1980), p. 42). It is well-known that the set of all BA $M$-tuples has Lebesgue measure zero, but nevertheless this set is quite large, namely there are uncountably many BA $M$-tuples (see Cassels (1955), Davenport (1962)) and the Hausdorff dimension of the set of all BA $M$-tuples is equal to $M$ (see Schmidt (1969)). In the case where $M = 1$, the number $\beta = \beta_1$ which satisfies (2.2) is referred to in the Diophantine approximation literature as a BA number (see, e.g., Schmidt (1980), p. 22); in view of Hurwitz's theorem, the constant $B(1)$ in this case must satisfy $0 < B(1) < 1/\sqrt{5}$ (see, e.g., Schmidt (1980), pp. 41-42). Furthermore, a characterization result exists, namely a real number, that is not an integer, is BA if and only if its continued fraction coefficients are bounded. The latter is often used as a definition of a BA number; however, there is no analogous characterization for $M > 1$ (see, e.g., Schmidt (1980), Theorem 5F, p. 22). The above definitions of BA numbers and BA $M$-tuples have also been extended to cover BA systems of linear forms (see, e.g., Schmidt (1980), p. 41) and their existence was proved by Perron, who also provided an algorithm for constructing BA linear forms (see, e.g., Schmidt (1980), Theorem 4B, p. 43). Furthermore, the existence of uncountably many BA systems of linear forms has been established (see Schmidt (1969)).

In what follows, we are interested in the case of BA $M$-tuples. Although, as indicated above, an algorithm is available for constructing BA $M$-tuples on the real line, these do not necessarily belong to any specified interval of the real line. Furthermore, if $M$ is strictly fixed (independent of $q$), one can treat $B$ in (2.2) as a positive constant; this, however, becomes impossible if the value of $M$ is growing. Using the technique described in Schmidt (1980), Section 4, pp. 43-45, we show that one can construct a BA $M$-tuple $\beta_1, \beta_2, \ldots, \beta_M$ of real numbers so that it lies in any specified interval $(a, b)$, $a < b$, of non-asymptotic length, of the real line, and derive a lower bound for $B(M)$ in (2.2) as $M \to \infty$. This result is proved in Section 5.
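For $M = 1$, both the continued fraction characterization and the constant in (2.2) can be examined numerically. The following sketch is an illustration only: the function names and the choice of the golden ratio as a test number are assumptions, and the floating-point continued fraction is reliable only for a small number of terms. It computes the first partial quotients and the empirical value of $\min_q q^{1/M} \max_i \|q\beta_i\|$ over a finite range of $q$, where $\|\cdot\|$ is the distance to the nearest integer.

```python
import math

def continued_fraction(x, n_terms):
    """First n_terms partial quotients of x (floating point: keep n_terms small)."""
    terms = []
    for _ in range(n_terms):
        a = math.floor(x)
        terms.append(a)
        frac = x - a
        if frac == 0:
            break
        x = 1.0 / frac
    return terms

def dist_to_int(x):
    """Distance from x to the nearest integer."""
    return abs(x - round(x))

def ba_constant(betas, q_max):
    """min over 1 <= q <= q_max of q**(1/M) * max_i ||q * beta_i||, cf. (2.2)."""
    M = len(betas)
    return min(q ** (1.0 / M) * max(dist_to_int(q * b) for b in betas)
               for q in range(1, q_max + 1))

golden = (1 + math.sqrt(5)) / 2
golden_terms = continued_fraction(golden, 10)   # all partial quotients equal 1
golden_bound = ba_constant((golden,), 2000)     # about 0.382, attained at q = 1
rational_bound = ba_constant((0.5,), 10)        # 0.0: a rational number is not BA
```

The bounded partial quotients of the golden ratio go together with a strictly positive empirical constant, which lies below the Hurwitz value $1/\sqrt{5}$, whereas a rational scale drives the constant to zero.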
3 An adaptive wavelet thresholding estimator

Let $\varphi^*(\cdot)$ and $\psi^*(\cdot)$ be the Meyer scaling and mother wavelet functions, respectively, in the real line (see, e.g., Meyer (1992) or Mallat (1999)). As usual,

$$\varphi^*_{jk}(x) = 2^{j/2}\varphi^*(2^j x - k), \quad \psi^*_{jk}(x) = 2^{j/2}\psi^*(2^j x - k), \quad j, k \in \mathbb{Z}, \quad x \in \mathbb{R},$$

are, respectively, the dilated and translated Meyer scaling and wavelet (orthonormal) basis functions at resolution level $j$ and scale position $k/2^j$. Similarly to Section 2.3 in Johnstone et al. (2004), we obtain a periodized version of the Meyer wavelet basis, by periodizing the basis functions $\{\varphi^*(\cdot), \psi^*(\cdot)\}$, i.e., for $j \ge 0$ and $k = 0, 1, \ldots, 2^j - 1$,

$$\varphi_{jk}(x) = \sum_{i \in \mathbb{Z}} 2^{j/2}\varphi^*(2^j(x + i) - k), \quad \psi_{jk}(x) = \sum_{i \in \mathbb{Z}} 2^{j/2}\psi^*(2^j(x + i) - k), \quad x \in T.$$

Let $\langle\cdot, \cdot\rangle$ denote the inner product in the Hilbert space $L^2(T)$ (the space of square-integrable functions defined on $T$), i.e., $\langle f, g\rangle = \int_T f(t)\overline{g(t)}\,dt$ for $f, g \in L^2(T)$. Let $e_m(t) = e^{i2\pi m t}$, $m \in \mathbb{Z}$, and let $f_m = \langle e_m, f\rangle$, $g_m(u) = \langle e_m, g(u, \cdot)\rangle$, $u \in U$. For any $j_0 \ge 0$ and any $j \ge j_0$, let $\varphi_{mj_0k} = \langle e_m, \varphi_{j_0k}\rangle$ and $\psi_{mjk} = \langle e_m, \psi_{jk}\rangle$, where $\{\varphi_{j_0,k}(\cdot), \psi_{j,k}(\cdot)\}$ is the periodic Meyer wavelet basis introduced above.

Using the periodized Meyer wavelet basis described above, and for any $j_0 \ge 0$, any (periodic) $f(\cdot) \in L^2(T)$ can be expanded as

$$f(t) = \sum_{k=0}^{2^{j_0}-1} a_{j_0k}\varphi_{j_0k}(t) + \sum_{j=j_0}^{\infty}\sum_{k=0}^{2^j-1} b_{jk}\psi_{jk}(t), \quad t \in T. \qquad (3.1)$$

Furthermore, by Plancherel's formula, the scaling coefficients, $a_{j_0k} = \langle f, \varphi_{j_0k}\rangle$, and the wavelet coefficients, $b_{jk} = \langle f, \psi_{jk}\rangle$, of $f(\cdot)$ can be represented as

$$a_{j_0k} = \sum_{m \in C_{j_0}} f_m \overline{\varphi_{mj_0k}}, \quad b_{jk} = \sum_{m \in C_j} f_m \overline{\psi_{mjk}}, \qquad (3.2)$$

where $C_{j_0} = \{m : \varphi_{mj_0k} \ne 0\}$ and, for any $j \ge j_0$, $C_j = \{m : \psi_{mjk} \ne 0\}$. Note that both $C_{j_0}$ and $C_j$, $j \ge j_0$, are subsets of $(2\pi/3)[-2^{j+2}, -2^j] \cup [2^j, 2^{j+2}]$, i.e.,

$$|m| \in (2\pi/3)[2^j, 2^{j+2}] \qquad (3.3)$$

due to the fact that Meyer wavelets are band limited (see, e.g., Johnstone et al. (2004), Section 3.1).

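Reconstruct the unknown response function $f(\cdot) \in L^2(T)$ in (1.3) as

$$\hat f_n(t) = \sum_{k=0}^{2^{j_0}-1} \hat a_{j_0k}\varphi_{j_0k}(t) + \sum_{j=j_0}^{J-1}\sum_{k=0}^{2^j-1} \hat b_{jk}\, I(|\hat b_{jk}| > \lambda_j)\,\psi_{jk}(t), \quad t \in T, \qquad (3.4)$$

where $\hat a_{j_0k}$ and $\hat b_{jk}$ are the natural estimates of $a_{j_0k}$ and $b_{jk}$, respectively (see (3.1) and (3.2)), given by

$$\hat a_{j_0k} = \sum_{m \in C_{j_0}} \hat f_m \overline{\varphi_{mj_0k}}, \quad \hat b_{jk} = \sum_{m \in C_j} \hat f_m \overline{\psi_{mjk}}, \qquad (3.5)$$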
m£Cj m£Cj 
with $\hat f_m$ obtained by

$$\hat f_m = \left(\sum_{l=1}^{M} \overline{g_m(u_l)}\, y_m(u_l)\right) \Big/ \left(\sum_{l=1}^{M} |g_m(u_l)|^2\right), \quad u_l \in U, \quad l = 1, 2, \ldots, M;$$

here $y_m(u_l)$ and $g_m(u_l)$, $l = 1, 2, \ldots, M$, are the discrete Fourier coefficients of $y(u_l, \cdot)$ and $g(u_l, \cdot)$, respectively, obtained by applying the discrete Fourier transform to the equation (1.3). Note that, in this case,

$$g_0(u_l) = 1 \quad \text{and} \quad g_m(u_l) = \gamma(u_l)\,\frac{\sin(2\pi m u_l)}{2\pi m}, \quad m \in \mathbb{Z}\setminus\{0\}, \quad l = 1, 2, \ldots, M. \qquad (3.6)$$
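The role of the channel parameters in (3.6) can be seen in a few lines of code. The sketch below is a hedged illustration: the function names are hypothetical, the default $\gamma(u) = 1/u$ is the boxcar choice, and the channel-combining rule is assumed to be the least-squares ratio $\sum_l \overline{g_m(u_l)}\, y_m(u_l) / \sum_l |g_m(u_l)|^2$, which is an assumption of this example. It shows that the coefficient $g_m(u)$ vanishes for a rational scale at some frequencies, and that a second channel with a different scale still allows $f_m$ to be recovered in the noiseless case.

```python
import math

def g_m(u, m, gamma=lambda u: 1.0 / u):
    """Fourier coefficient of the boxcar-like kernel, following (3.6)."""
    if m == 0:
        return 1.0
    return gamma(u) * math.sin(2 * math.pi * m * u) / (2 * math.pi * m)

def combine_channels(y_coeffs, g_coeffs):
    """Estimate f_m from per-channel coefficients y_m(u_l) and g_m(u_l).

    Assumed form (not confirmed by the source): sum conj(g) * y / sum |g|^2.
    """
    num = sum(g.conjugate() * y for g, y in zip(g_coeffs, y_coeffs))
    den = sum(abs(g) ** 2 for g in g_coeffs)
    return num / den

# Rational scale u = 1/4: the coefficient at m = 2 is zero up to rounding.
vanishing = g_m(0.25, 2)

# Two channels, noiseless data y_m(u_l) = g_m(u_l) * f_m for a made-up f_m.
us = [0.25, (math.sqrt(5) - 1) / 4]
f_true = 0.7 + 0.2j
gs = [g_m(u, 2) for u in us]
f_est = combine_channels([g * f_true for g in gs], gs)
```

With a single channel at $u = 1/4$ the denominator would be numerically zero at $m = 2$, which is the obstacle that BA numbers and BA $M$-tuples are designed to avoid.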

The choices of the resolution levels $j_0$ and $J$ and the thresholds $\lambda_j$ will be described in Section 6 when we examine an expression for the $L^2$-risk of the estimator (3.4) over a collection of Besov balls, leading to an adaptive estimator (i.e., its construction is independent of the Besov ball parameters that are usually unknown in practice).
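The thresholding step in (3.4) keeps an estimated wavelet coefficient only when its magnitude exceeds the level-dependent threshold. A minimal sketch, assuming the coefficients are already available as plain Python lists (the function names and the dictionary layout are hypothetical):

```python
def hard_threshold(coeffs, lam):
    """Keep coefficients with |b| > lam and set the others to zero, as in (3.4)."""
    return [b if abs(b) > lam else 0.0 for b in coeffs]

def threshold_levels(coeffs_by_level, lambdas):
    """Apply the level-dependent thresholds lambda_j to a dict {j: [b_jk]}."""
    return {j: hard_threshold(bs, lambdas[j]) for j, bs in coeffs_by_level.items()}
```

For example, `hard_threshold([0.5, -2.0, 1.0], 1.0)` keeps only the middle coefficient.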

Among the various characterizations of Besov spaces for periodic functions defined on $L^p(T)$ in terms of wavelet bases, we recall that for an $r$-regular ($0 < r \le \infty$) multiresolution analysis with $0 < s < r$ and for a Besov ball $B^s_{p,q}(A)$ of radius $A > 0$ with $1 \le p, q \le \infty$, one has that, with $s' = s + 1/2 - 1/p$,

$$B^s_{p,q}(A) = \left\{ f(\cdot) \in L^p(T) : \left(\sum_{k=0}^{2^{j_0}-1} |a_{j_0k}|^p\right)^{1/p} + \left(\sum_{j=j_0}^{\infty} 2^{js'q}\left(\sum_{k=0}^{2^j-1} |b_{jk}|^p\right)^{q/p}\right)^{1/q} \le A \right\}, \qquad (3.7)$$

with respective sum(s) replaced by maximum if $p = \infty$ or $q = \infty$ (see, e.g., Johnstone et al. (2004), Section 2.4). (Note that, for the Meyer wavelet basis considered above, $r = \infty$.)
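The left-hand side of the inequality in (3.7) is straightforward to evaluate for a finite set of coefficients. The sketch below is an illustration for finite $p$ and $q$ only; the function name and the dictionary layout of the wavelet coefficients are assumptions of this example.

```python
def besov_sequence_norm(a, b_by_level, s, p, q):
    """Left-hand side of the inequality in (3.7) for finite p and q.

    a          : scaling coefficients a_{j0 k} at the coarsest level
    b_by_level : dict {j: [b_jk]} of wavelet coefficients, j >= j0
    """
    s_prime = s + 0.5 - 1.0 / p
    coarse = sum(abs(x) ** p for x in a) ** (1.0 / p)
    detail = sum(2 ** (j * s_prime * q) * sum(abs(x) ** p for x in bs) ** (q / p)
                 for j, bs in b_by_level.items())
    return coarse + detail ** (1.0 / q)
```

A function belongs to the ball $B^s_{p,q}(A)$ when this quantity does not exceed $A$.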

The parameter $s$ measures the number of derivatives, where the existence of derivatives is required in an $L^p$-sense, while the parameter $q$ provides a further finer gradation. The Besov spaces include, in particular, the well-known Sobolev and Hölder spaces of smooth functions but in addition less traditional spaces, like the space of functions of bounded variation. The latter functions are of statistical interest because they allow for better models of spatial inhomogeneity (see, e.g., Meyer (1992)).

The precision of the estimator (3.4) is measured by the (maximal) $L^2$-risk given by

$$R_n(\hat f_n) = \sup_{f \in B^s_{p,q}(A)} E\|\hat f_n - f\|_2^2. \qquad (3.8)$$

We are interested in the asymptotic rate of convergence of the estimator $\hat f_n$, i.e., we are interested in the following asymptotical upper bounds

$$R_n(\hat f_n) \le C\, r_n \quad \text{as } n \to \infty,$$

where $\{r_n\}_{n=1}^{\infty}$ is a positive sequence converging to $0$ as $n \to \infty$ and $C > 0$ is a generic constant, independent of $n$, which may take different values at different places.

Hereafter, $\|\cdot\|_2$ denotes the $L^2$-norm, $\hat f_n(\cdot)$ is an estimator (i.e., a measurable function) of $f(\cdot) \in L^2(T)$, based on observations from model (1.3), and the expectation in (3.8) is taken under the true $f(\cdot)$.
4 Relation to the theory of Diophantine approximation



By direct evaluations (see also Lemma 5 and its proof in the Appendix), one can show that

$$E|\hat b_{jk} - b_{jk}|^2 = N^{-1} \sum_{m \in C_j} |\psi_{mjk}|^2 \left[\sum_{l=1}^{M} |g_m(u_l)|^2\right]^{-1}.$$

Since in the case of Meyer wavelets, $|\psi_{mjk}| \le 2^{-j/2}$ and $|C_j| \asymp 2^j$ (see, e.g., Johnstone et al. (2004), p. 565), we derive

$$E|\hat b_{jk} - b_{jk}|^2 = O\left(n^{-1}\Delta_1(j)\right), \qquad (4.1)$$

where

$$\Delta_1(j) = \frac{1}{|C_j|} \sum_{m \in C_j} [\tau_1(m)]^{-1}$$

with $\tau_1(m) = \tau_1(m; \mathbf{u}, M) = M^{-1}\sum_{l=1}^{M} |g_m(u_l)|^2$, $\mathbf{u} = (u_1, u_2, \ldots, u_M)$. By (1.2) and (3.6), one has

$$\tau_1(m) \asymp \frac{1}{m^2 M} \sum_{l=1}^{M} \sin^2(2\pi m u_l). \qquad (4.2)$$

Here, $u(m) \asymp v(m)$ means that there exist constants $C_1 > 0$ and $C_2 > 0$, independent of $m$, such that $0 < C_1 v(m) \le u(m) \le C_2 v(m) < \infty$ for every $m$.

Therefore, the risk of the estimator $\hat f_n(\cdot)$ defined in (3.4) is determined by the rate of growth of $\Delta_1(j)$ as $j \to \infty$ which, in turn, depends on the rate at which $\tau_1(m)$ goes to zero as $m \to \infty$.

It is easy to see that for some choices of $M$ and $\mathbf{u}$ (e.g., $M = 1$, $u_1 = u = 1$), one has $\tau_1(m; \mathbf{u}, M) = 0$ for every $m$, which leads to infinite variances of the estimated coefficients $\hat b_{jk}$ and, consequently, to an infinite $L^2$-risk. Hence, the choice of $M$ and the selection of points $\mathbf{u}$ are of the utmost importance. In particular, we want to choose points $(u_1, u_2, \ldots, u_M)$ such that $\sum_{l=1}^{M} \sin^2(2\pi m u_l)$ is as large as possible for $m \in C_j$ and large $j$.

Moreover, for any choice of $M$ and any selection of points $\mathbf{u}$, one has $\tau_1(m; \mathbf{u}, M) \le K_1 m^{-2}$ for some constant $K_1 > 0$ independent of $m$, the choice of $M$ and the selection of points $\mathbf{u}$, so that, for any $j$ and selection of $M$ and $\mathbf{u}$,

$$\Delta_1(j) \ge K_2 2^{2j}, \qquad (4.3)$$

for some constant $K_2 > 0$, independent of $j$. It turns out that if $M = M_n$ increases at least as fast as $n^{1/3}$, then, by sampling $u_l$, $l = 1, 2, \ldots, M$, uniformly on $U$, i.e., by selecting $u_l = a + (b - a)l/M$, $l = 1, 2, \ldots, M$, one can attain $\Delta_1(j) \le K_3 2^{2j}$ for some constant $K_3 > 0$, independent of $j$, so that the upper and the lower bounds in this case coincide up to a constant independent of $n$ (see Pensky & Sapatinas (2010)).
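The quantities $\tau_1(m)$ and $\Delta_1(j)$ can be evaluated numerically for a given design. The sketch below is only a rough illustration: $\tau_1(m)$ is replaced by the right-hand side of (4.2) with all constants set to one, the index set $C_j$ is simplified to the integers between $2^j$ and $2^{j+2}$ (dropping the factor $2\pi/3$), and the function names and test design are hypothetical.

```python
import math

def tau1(m, us):
    """Right-hand side of (4.2): (1 / (m^2 M)) * sum_l sin^2(2 pi m u_l)."""
    M = len(us)
    return sum(math.sin(2 * math.pi * m * u) ** 2 for u in us) / (m * m * M)

def delta1(j, us):
    """Average of 1 / tau_1(m) over 2^j <= m <= 2^(j+2), a stand-in for C_j."""
    ms = range(2 ** j, 2 ** (j + 2) + 1)
    total = 0.0
    for m in ms:
        t = tau1(m, us)
        if t == 0.0:
            return float("inf")     # a vanishing tau_1 gives an infinite variance
        total += 1.0 / t
    return total / len(ms)

def equidistant_points(a, b, M):
    """Equidistant design u_l = a + (b - a) l / M, l = 1, ..., M."""
    return [a + (b - a) * l / M for l in range(1, M + 1)]
```

Since each squared sine is at most one, this simplified $\tau_1(m)$ never exceeds $m^{-2}$, so `delta1(j, us)` is at least $2^{2j}$ for any design, in line with (4.3).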

Unfortunately, the above results do not hold for finite values of $M$ or when $M = M_n$ is a slowly growing function of $n$. Indeed, in the case of small values of $M$, both $\tau_1(m; \mathbf{u}, M)$ and $\Delta_1(j)$ have completely different dynamics from the case of large $M$. In particular, if $M = 1$, Johnstone & Raimondo (2004) and Johnstone et al. (2004) showed that in the case of $\gamma(u) = 1/u$, $u_1 = u^* = a = b$, one has $\Delta_1(j) \ge K_4 2^{3j}$ for any choice of $u^*$ and some constant $K_4 > 0$, independent of $j$. Johnstone et al. (2004) also demonstrated that if $u^*$ is selected to be a BA number, then the lower bound for $\Delta_1(j)$ is attainable, i.e., $\Delta_1(j) \le K_5 2^{3j}$ for some constant $K_5 > 0$, independent of $j$. Hence, in this case, $\Delta_1(j) \asymp 2^{3j}$.

These results were extended by De Canditiis & Pensky (2006) who studied the multichannel deconvolution model with a boxcar kernel and showed that the convergence rates obtained by Johnstone et al. (2004) for $M = 1$ can be improved by sampling at several different points. In particular, they demonstrated that if $M$ is finite, $M \ge 2$, one of the $u_1, u_2, \ldots, u_M$ is a BA number, and $\mathbf{u}$ is a BA $M$-tuple defined in (2.2), then

$$\Delta_1(j) \le C(M)\, j\, 2^{j(2+1/M)} \qquad (4.4)$$

for some positive $C(M)$. In particular, when $M$ is growing with $n$, the value of $C(M)$ depends on $n$ and, hence, affects the convergence rates of the estimator $\hat f_n(\cdot)$ as $n \to \infty$.

The relation between the convergence rates of the estimator $\hat f_n(\cdot)$, given by (3.4), of $f(\cdot)$ in the model (1.3) and the theory of Diophantine approximation becomes obvious when one notes that in (4.2), for any $m \in \mathbb{Z}\setminus\{0\}$ and any $u_l$, $l = 1, 2, \ldots, M$, one has, combining the periodic behavior of the sine function together with a first order (linear) approximation,

$$4\|2mu_l\|^2 \le \sin^2(2\pi m u_l) \le \pi^2\|2mu_l\|^2, \quad l = 1, 2, \ldots, M, \qquad (4.5)$$

where $\|a\| = \inf\{|a - k|, k \in \mathbb{Z}\}$ denotes the distance from a real number $a$ to the nearest integer number. Hence, (4.2) becomes

$$\tau_1(m) = \tau_1(m; \mathbf{u}, M) \asymp \frac{1}{m^2 M} \sum_{l=1}^{M} \|2mu_l\|^2, \qquad (4.6)$$

so that the convergence rates of the estimator $\hat f_n(\cdot)$ depend on the lower bound, in terms of $m$, of the expression (4.6).
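The two-sided bound (4.5) is the elementary inequality $4\|x\|^2 \le \sin^2(\pi x) \le \pi^2\|x\|^2$ applied with $x = 2mu_l$. A quick numerical check, written as a sketch with hypothetical function names and a small tolerance for floating-point rounding:

```python
import math

def dist_to_int(x):
    """Distance from x to the nearest integer."""
    return abs(x - round(x))

def check_sine_bounds(x, tol=1e-12):
    """Check 4 ||x||^2 <= sin^2(pi x) <= pi^2 ||x||^2 for a real x."""
    d = dist_to_int(x)
    s = math.sin(math.pi * x) ** 2
    return 4 * d * d - tol <= s <= math.pi ** 2 * d * d + tol

# Grid of test points; in (4.5) the argument is x = 2 * m * u_l.
bounds_hold = all(check_sine_bounds(0.0073 * k) for k in range(1, 5000))
```

The lower bound is attained when $\|x\| = 1/2$ and both bounds are attained when $\|x\| = 0$, which is why the inequalities in (4.5) are not strict.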

The value of $C(M)$ in (4.4) is related to the value of $B(M)$ in (2.2). To the best of our knowledge, no procedure has been developed for the construction of a BA $M$-tuple on a specified interval, of non-asymptotic length, of the real line, and there are no asymptotic lower bounds, in terms of $M$, on $B(M)$ in (2.2) when the value of $M$ is growing. For this reason, in order to find upper bounds for the $L^2$-risk of the estimator (3.4) and choose an optimal relation between the sample size $n$ and the number of channels $M$ when $M = M_n$ is a slowly growing function of $n$, we need to obtain new original results in Diophantine approximations. In particular, the objective of the next section is to construct a BA $M$-tuple on the non-asymptotic interval $U$, of the real line, and to obtain a lower bound on $B(M)$ in terms of $M$ for this BA $M$-tuple when $M$ grows slowly with $n$.



5 Construction of a BA M-tuple on a specified interval 

Below, we construct a BA $M$-tuple $\beta = (\beta_1, \beta_2, \ldots, \beta_M)$ of real numbers on a specified interval $(a, b)$, of a non-asymptotic length, of the real line, and derive the lower bound on $B(M)$ in formula (2.2). For this construction, we use the technique described in Schmidt (1980), Section 4, pp. 43-45. In particular, we shall provide an algorithm for construction of an $M$-tuple $\beta_1, \beta_2, \ldots, \beta_M$ of real numbers such that, as $M \to \infty$,

1. it lies in any specified interval $(a, b)$, $a < b$, of non-asymptotic length, of the real line, and

2. it satisfies

$$\max_{i=1,2,\ldots,M} |\beta_i q - p_i| > B_0 \exp(-6M\ln M)\, q^{-1/M}, \qquad (5.1)$$

for any integer numbers $q > 0$ and $p_1, p_2, \ldots, p_M$, and for some constant $B_0 > 0$, independent of $M$, $q$ and $p_1, p_2, \ldots, p_M$, so that $B(M) = B_0 \exp(-6M\ln M)$ in (2.2).
Assume that $M$ is large enough, fix a positive integer $Q$ and consider

$$P(x) = (x - Q)(x - 2Q) \cdots (x - MQ) - 1, \qquad (5.2)$$

a monic polynomial (i.e., a polynomial with a unit leading coefficient) of degree $M$. Let $\xi_1, \xi_2, \ldots, \xi_M$ be the roots of the polynomial (5.2). Recall that $\xi$ is called an algebraic integer number if it is a root of some monic polynomial with coefficients being integer numbers. Algebraic integers are called conjugate if they are roots of the same monic polynomial with integer coefficients.

Then, the following statement is valid. 

Lemma 1 If $Q > 5M$, then $\xi_1, \xi_2, \ldots, \xi_M$ are real conjugate algebraic integer numbers such that

$$(i - 1/2)Q < \xi_i < (i + 1/2)Q, \quad i = 1, 2, \ldots, M. \qquad (5.3)$$
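The localization of the roots stated in Lemma 1 can be checked numerically for small $M$. The sketch below is an illustration, not the proof: it evaluates $P(x)$ from (5.2) in floating point, verifies that $P$ changes sign on each interval $((i - 1/2)Q, (i + 1/2)Q)$ and locates the root by bisection. The function names and the test values $M = 4$, $Q = 21$ are arbitrary assumptions, and floating-point evaluation is only adequate for small $M$.

```python
def P(x, M, Q):
    """P(x) = (x - Q)(x - 2Q)...(x - MQ) - 1, cf. (5.2)."""
    prod = 1.0
    for k in range(1, M + 1):
        prod *= (x - k * Q)
    return prod - 1.0

def roots_by_bisection(M, Q, iters=200):
    """One root in each interval ((i - 1/2)Q, (i + 1/2)Q), found by bisection."""
    roots = []
    for i in range(1, M + 1):
        lo, hi = (i - 0.5) * Q, (i + 0.5) * Q
        f_lo = P(lo, M, Q)
        if f_lo * P(hi, M, Q) >= 0:
            raise ValueError("no sign change on interval %d" % i)
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            f_mid = P(mid, M, Q)
            if f_lo * f_mid <= 0:
                hi = mid                 # root lies in [lo, mid]
            else:
                lo, f_lo = mid, f_mid    # root lies in (mid, hi]
        roots.append(0.5 * (lo + hi))
    return roots

xi = roots_by_bisection(4, 21)   # M = 4, Q = 21 > 5M
```

Because $P$ has degree $M$ and changes sign on $M$ disjoint intervals, each interval contains exactly one root, so all roots are real; each root sits very close to the corresponding multiple $iQ$.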

Now, to construct a BA $M$-tuple, choose $Q > 5(M+1)$ and construct real conjugate algebraic integers $\xi_1, \xi_2, \ldots, \xi_M, \xi_{M+1}$ using the process described in Lemma 1. Let $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_M)$ be a solution to the following system of equations:

$$\sum_{l=1}^{M} \xi_k^{l-1} \alpha_l = -\xi_k^{M}, \quad k = 1, 2, \ldots, M. \qquad (5.4)$$

Observe that the determinant of the system of equations (5.4) is a Vandermonde determinant; hence, it is nonzero since $\xi_i \ne \xi_j$ for $i \ne j$. Therefore, the system of linear equations (5.4) has a unique solution $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_M)$ which turns out to be a BA $M$-tuple.



Lemma 2 The solution $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_M)$ of the system of equations (5.4) is a BA $M$-tuple such that

$$|\alpha_k| < 30\exp(3M\ln M), \quad k = 1, 2, \ldots, M, \qquad (5.5)$$

and for any integer numbers $q > 0$ and $p_1, p_2, \ldots, p_M$, as $M \to \infty$, one has

$$\max_{i=1,2,\ldots,M} |\alpha_i q - p_i| > C\exp(-3M\ln M)\, q^{-1/M} \qquad (5.6)$$

with some constant $C > 0$, independent of $M$, $q$ and $p_1, p_2, \ldots, p_M$.

Lemma 2 provides a BA $M$-tuple which, however, does not necessarily belong to the specified interval $(a, b)$, of a non-asymptotic length, of the real line. Assume, without loss of generality, that both $a$ and $b$ are rational numbers; otherwise, replace $(a, b)$ by $(a^*, b^*) \subset (a, b)$, where $a^*$ and $b^*$ are rational numbers. Let $a = p_a/q_0$ and $b = p_b/q_0$ for some integer numbers $p_a$, $p_b$ and $q_0$, and let $z$ be an integer number such that $z - 1 < 30\exp(3M\ln M) < z$. Define

$$\beta_l = a + \alpha_l (b - a)/z, \quad l = 1, 2, \ldots, M, \qquad (5.7)$$

where $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_M)$ is the BA $M$-tuple constructed in Lemma 2.

The following theorem confirms that $\beta = (\beta_1, \beta_2, \ldots, \beta_M)$, as constructed above, forms a BA $M$-tuple on the specified interval $(a, b)$, of a non-asymptotic length, of the real line.



Theorem 1 The real numbers $\beta_1, \beta_2, \ldots, \beta_M$ defined in (5.7) lie on the interval $(a, b)$, of a non-asymptotic length, and form a BA $M$-tuple, so that, as $M \to \infty$, one has

$$\max_{i=1,2,\ldots,M} |\beta_i q - p_i| > B_0 \exp(-6M\ln M)\, q^{-1/M}, \qquad (5.8)$$

for any integer numbers $q > 0$ and $p_1, p_2, \ldots, p_M$, and for some constant $B_0 > 0$, independent of $M$, $q$ and $p_1, p_2, \ldots, p_M$, so that $B(M) = B_0 \exp(-6M\ln M)$.
6 Asymptotical upper bounds for the $L^2$-risk of the adaptive wavelet thresholding estimator



In Section 5, we constructed a BA $M$-tuple and derived a lower bound on $B(M)$ in (2.2) as $M \to \infty$. We can now choose the resolution levels $j_0$ and $J$, the thresholds $\lambda_j$ in (3.4) and the optimal relation between the total number of observations $n$ and the number of channels $M = M_n$ and derive asymptotical upper bounds for the $L^2$-risk of the estimator $\hat f_n(\cdot)$ given by (3.4) over a collection of Besov balls.

In order to formulate and prove Theorem 2, we first need to obtain some preliminary results. For this purpose, we recall the equidistribution lemma (see Lemma 3), proved in Johnstone & Raimondo (2004), which we state here for completeness, and formulate a new lemma (i.e., Lemma 4) which is based on application of Lemma 3 to the BA $M$-tuple. Recall that $\|a\|$ denotes the distance from a real number $a$ to the nearest integer number.

Lemma 3 (Lemma 1 in Johnstone & Raimondo (2004)) Let $p/q$ and $p'/q'$ be successive convergents in the continued fraction expansion of a real number $a$. Let $N$ be a positive integer number with $N + q < q'$. Let $h$ be a non-increasing function. Then

$$\sum_{i=4}^{q} h(i/q) \le \sum_{k=N+1}^{N+q} h(\|ka\|) \le 2\sum_{i=1}^{q-3} h(i/q) + 6h(1/(2q')).$$

Lemma 4 Let $\beta_1, \beta_2, \ldots, \beta_M$ be a BA $M$-tuple constructed in Theorem 1 and let $\beta_1$ be a BA number. Let $r_0$ be an arbitrary fixed positive real number. Denote

$$\aleph_k(j, M) = \sum_{l \in \Omega_j} \left[\|l\beta_1\|^2 + \cdots + \|l\beta_M\|^2\right]^{-k}, \tag{6.1}$$

where $\Omega_j$ is defined as

$$\Omega_j = \{l : 2^j \le |l| \le 2^{j+r_0}\}. \tag{6.2}$$

If $M$ is large enough, then, as $j \to \infty$,

$$\aleph_k(j, M) = O\left(j\, 2^{j(1+(2k-1)/M)}\, e^{6(2k-1)M\ln M}\right), \quad k = 1, 2, 3, 4. \tag{6.3}$$
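To make the quantity in (6.1) concrete, the following sketch evaluates the sum of $[\|l\beta_1\|^2 + \cdots + \|l\beta_M\|^2]^{-k}$ over the index set (6.2) numerically. The names `dist_to_nearest_int` and `aleph` are hypothetical, and the sample values of `betas` are arbitrary irrational numbers chosen for illustration, not the BA $M$-tuple of Theorem 1. Floating-point arithmetic is assumed to be adequate, which holds only for moderate $j$.

```python
import math


def dist_to_nearest_int(x):
    """||x||: distance from the real number x to the nearest integer."""
    return abs(x - round(x))


def aleph(k, j, betas, r0=1):
    """Numerically evaluate the sum in (6.1) over Omega_j from (6.2)."""
    lo, hi = 2 ** j, int(2 ** (j + r0))
    total = 0.0
    for l in range(lo, hi + 1):
        s = sum(dist_to_nearest_int(l * b) ** 2 for b in betas)
        total += s ** (-k)
    # Omega_j is symmetric about zero and ||-x|| = ||x||,
    # so negative indices contribute the same amount.
    return 2.0 * total


# Illustrative (hypothetical) tuple; any beta with l*beta an integer would
# make a term infinite, so irrational sample values are used.
example = aleph(1, 4, [math.sqrt(2), math.sqrt(3)])
```

Comparing `aleph(k, j, ...)` across a few values of `j` gives a rough feel for the growth that (6.3) bounds, although such an experiment proves nothing about the asymptotics.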

We also need the following two lemmas which evaluate the precision of estimation of $a_{j_0 k}$ and $b_{jk}$.

Lemma 5 Let $\beta = (\beta_1, \beta_2, \ldots, \beta_M)$ be a BA $M$-tuple constructed on the interval $(2a, 2b)$ according to Theorem 1 and let one of $\beta_1, \beta_2, \ldots, \beta_M$ be a BA number. Let the equation (1.3) be evaluated at the point $u$ with components $u_l = \beta_l/2$, $l = 1, 2, \ldots, M$. Then, for all $j \ge j_0$, as $n \to \infty$,




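Lemma 6 Let $u_1, u_2, \ldots, u_M$ be as in Lemma 5. If $\eta > 0$ is a constant large enough, then, for all $j \ge j_0$, as $n \to \infty$,

$$P\left(|\hat b_{jk} - b_{jk}|^2 > \eta^2 (n_M)^{-1} j\, 2^{j(2+1/M)} \ln n\right) = o\left(n^{-\theta}\right),$$

where $n_M = (n/M)\exp(-6M\ln M)$ and $\theta = \eta^2/(2C_\psi)$ with $C_\psi = 2^{-j}|C_j|$.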
We are now ready to formulate Theorem 2. Let the resolution levels $j_0$ and $J$ and the thresholds $\lambda_j$ be such that

$$2^{j_0} = \ln n, \quad 2^{J} = (n_M)^{\frac{1}{3+1/M}}, \quad \lambda_j = \eta\, (n_M)^{-1/2} \sqrt{j\, 2^{j(2+1/M)} \ln n}, \tag{6.4}$$

for some constant $\eta > 0$, where

$$n_M = \frac{n}{M} \exp(-6M\ln M). \tag{6.5}$$

Note that since the construction of $j_0$, $J$ and $\lambda_j$ is independent of the Besov ball parameters $s$, $p$, $q$ and $A$, the suggested wavelet thresholding estimator $\hat f_n(\cdot)$ given by (3.4) is adaptive with respect to these parameters.
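As a rough numerical illustration of how these tuning parameters behave, the sketch below computes $n_M$, $j_0$, $J$ and $\lambda_j$ for given $n$ and $M$. It assumes the forms $n_M = (n/M)\exp(-6M\ln M)$, $2^{j_0} = \ln n$, $2^J = (n_M)^{1/(3+1/M)}$ (the value of $2^J$ used for the term $R_1$ in the proof of Theorem 2) and $\lambda_j^2 = \eta^2 (n_M)^{-1} j\, 2^{j(2+1/M)} \ln n$. The function name `tuning_parameters` and the default `eta=1.0` are hypothetical, and the levels are returned as real numbers; in practice they would be rounded to integers.

```python
import math


def tuning_parameters(n, M, eta=1.0):
    """Return (n_M, j0, J, lam) under the assumed forms of (6.4)-(6.5)."""
    # ln n_M = ln(n / M) - 6 M ln M; work on the log scale to avoid underflow.
    log_n_M = math.log(n / M) - 6 * M * math.log(M)
    n_M = math.exp(log_n_M)
    j0 = math.log2(math.log(n))                  # 2^{j0} = ln n
    J = log_n_M / ((3 + 1 / M) * math.log(2))    # 2^J = n_M^{1/(3+1/M)}

    def lam(j):
        # lambda_j = eta * sqrt(j * 2^{j(2+1/M)} * ln n / n_M)
        return eta * math.sqrt(j * 2 ** (j * (2 + 1 / M)) * math.log(n) / n_M)

    return n_M, j0, J, lam


n_M, j0, J, lam = tuning_parameters(10 ** 9, 2)
```

Even for $M = 2$ the factor $\exp(-6M\ln M) = 2^{-12}$ reduces the effective sample size noticeably, which is consistent with the requirement that $M_n$ grow only slowly with $n$.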

The following statement provides the asymptotical upper bounds for the L 2 -risk, over a 
collection of Besov balls. 

Theorem 2 Let $s > 1/\min(p, 2)$, $1 \le p \le \infty$, $1 \le q \le \infty$ and $A > 0$. Let $\beta = (\beta_1, \beta_2, \ldots, \beta_M)$ be a BA $M$-tuple constructed on the interval $(2a, 2b)$ according to Theorem 1 and let one of $\beta_1, \beta_2, \ldots, \beta_M$, say $\beta_1$, be a BA number. Let the equation (1.3) be evaluated at the point $u$ with components $u_l = \beta_l/2$, $l = 1, 2, \ldots, M$. Choose

$$M = M_n = \nu \sqrt{\ln n/(\ln\ln n)} \tag{6.6}$$

for some $\nu > 0$ independent of $n$. Let $\hat f_n(\cdot)$ be the adaptive wavelet thresholding estimator defined by (3.4) with $j_0$, $J$ and $\lambda_j$ given by (6.4), and $n_M$ given by (6.5). Then, as $n \to \infty$,



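$$R_n(\hat f_n) \le \begin{cases} C\, n^{-\frac{2s}{2s+3}}\, a_n, & \text{if } s > 3(1/p - 1/2),\\[4pt] C\, n^{-\frac{2s'}{2s'+2}}\, a_n, & \text{if } s \le 3(1/p - 1/2), \end{cases} \tag{6.7}$$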



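where $a_n$ is given by

$$a_n = \exp\left\{ \sqrt{\ln n}\, \sqrt{\ln\ln n} \left[ \frac{\Delta_1}{\Delta_2}\left( 3\nu + \frac{1}{\Delta_2 \nu} \right) + r_n \right] \right\} \tag{6.8}$$

with

$$r_n = \frac{3\Delta_1 \nu}{\Delta_2}\, \frac{\ln\ln\ln n}{\ln\ln n} \left( \frac{2\ln\nu}{\ln\ln\ln n} - 1 \right) + \frac{\ln\ln\ln n}{\sqrt{\ln n}\, \sqrt{\ln\ln n}} \left( \frac{3\Delta_1}{\Delta_2^2} - \frac{\Delta_1}{2\Delta_2} \right) + \frac{\sqrt{\ln\ln n}}{\sqrt{\ln n}} \left( \frac{\Delta_3}{\Delta_2} + \frac{\Delta_1}{2\Delta_2} - \frac{3\Delta_1}{\Delta_2^2} - \frac{\Delta_1}{\nu^2 \Delta_2^3} \right) = o(1) \quad (n \to \infty),$$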



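where

$$\begin{array}{llll} \Delta_1 = 2s, & \Delta_2 = 2s + 3, & \Delta_3 = 2s, & \text{if } 2 \le p \le \infty,\\ \Delta_1 = 2s, & \Delta_2 = 2s + 3, & \Delta_3 = 4s, & \text{if } 6/(2s+3) < p < 2,\\ \Delta_1 = 2s^*, & \Delta_2 = 2s^* + 2, & \Delta_3 = 4s^*, & \text{if } 1 \le p \le 6/(2s+3), \end{array} \tag{6.9}$$

with $s^* = \min(s', s)$, $s' = s + 1/2 - 1/p$.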
7 Discussion



We considered the problem of estimating the unknown response function in the multichannel deconvolution model with a boxcar-like kernel when the number of channels grows as the total number of observations increases. This situation is of particular interest nowadays since data recording equipment is getting cheaper while overall volumes of data are growing very fast. Our aim was to investigate the situation when the number of channels $M = M_n$ grows slowly with the number of observations $n$.

For this purpose, we obtained new results in the field of Diophantine approximation in order to devise a technique which allows the reconstruction of the unknown response function with a precision that differs from the best possible convergence rates (which can be attained in the corresponding continuous functional deconvolution model (1.4)) by a factor which grows slower than any power of $n$.

Specifically, in Section 6, we derived asymptotical upper bounds for the $L^2$-risk of the adaptive wavelet thresholding estimator (3.4) of $f(\cdot) \in L^2(T)$ in the model (1.3). In comparison, it follows from Pensky & Sapatinas (2010) that the choice of a uniform sampling strategy (i.e., $u_l = a + (b-a)l/M$, $l = 1, 2, \ldots, M$, for $M = M_n \ge (32\pi/3)(b-a)n^{1/3}$) leads to an adaptive wavelet block thresholding estimator $\tilde f_n(\cdot)$ of $f(\cdot)$ with the following convergence rates

$$R_n(\tilde f_n) \le \begin{cases} C\, n^{-\frac{2s}{2s+3}} (\ln n)^{\rho}, & \text{if } s > 3(1/p - 1/2),\\[4pt] C \left(\frac{\ln n}{n}\right)^{\frac{s'}{s'+1}} (\ln n)^{\rho}, & \text{if } s \le 3(1/p - 1/2), \end{cases} \tag{7.1}$$

for $s > 1/\min(p, 2)$, where $s' = s + 1/2 - 1/p$ and

$$\rho = \begin{cases} \frac{3\max(0,\, 2/p - 1)}{2s+3}, & \text{if } s > 3(1/p - 1/2),\\ \max(0,\, 1 - p/q), & \text{if } s = 3(1/p - 1/2),\\ 0, & \text{if } s < 3(1/p - 1/2). \end{cases}$$

Moreover, it has been shown that the above convergence rates with $\rho = 0$ are the fastest possible ones (see Pensky & Sapatinas (2010)); hence, up to the logarithmic factor $(\ln n)^{\rho}$, $\tilde f_n(\cdot)$ attains the best possible convergence rates. By comparing the convergence rates (6.7) in Theorem 2 to the fastest possible convergence rates (without the extra logarithmic factor $(\ln n)^{\rho}$ appearing in (7.1)), one concludes that they differ by the extra factor $a_n$ defined in (6.8).

How fast does $a_n \to \infty$ as $n \to \infty$? It can be easily seen that $a_n$ grows slower than any power of $n$ but faster than any power of $\ln n$, i.e., for any $\alpha_1, \alpha_2 > 0$, one has

$$\lim_{n \to \infty} \frac{a_n}{n^{\alpha_1}} = 0, \qquad \lim_{n \to \infty} \frac{a_n}{(\ln n)^{\alpha_2}} = \infty.$$
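The two limits can be checked numerically on the logarithmic scale. The sketch below keeps only the leading term of the exponent of $a_n$, namely $\sqrt{\ln n \, \ln\ln n}\,(\Delta_1/\Delta_2)(3\nu + 1/(\Delta_2 \nu))$, and drops the remainder $r_n$; this truncation is an assumption of the sketch. It also assumes the case $\Delta_1 = 2s$, $\Delta_2 = 2s + 3$, and the function name `log_a_n_leading` is hypothetical. The argument is $\ln n$ rather than $n$ so that very large sample sizes can be examined without overflow.

```python
import math


def log_a_n_leading(log_n, s, nu=None):
    """Leading term of ln(a_n), with the remainder r_n dropped.

    Assumes Delta_1 = 2s and Delta_2 = 2s + 3; nu defaults to
    (3 * Delta_2) ** -0.5, the minimiser of 3*nu + 1/(Delta_2*nu).
    """
    d1, d2 = 2.0 * s, 2.0 * s + 3.0
    if nu is None:
        nu = (3.0 * d2) ** -0.5
    c = (d1 / d2) * (3.0 * nu + 1.0 / (d2 * nu))
    return c * math.sqrt(log_n * math.log(log_n))


for exponent in (6, 12, 100):
    L = exponent * math.log(10.0)            # ln n for n = 10**exponent
    la = log_a_n_leading(L, s=2.0)
    # la / L -> 0 (slower than any power of n);
    # la / ln(L) -> infinity (faster than any power of ln n).
    print(exponent, la / L, la / math.log(L))
```

The first ratio decreases and the second increases along this sequence, but both do so slowly, which illustrates why $a_n$ sits strictly between the two scales.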

Hence, although choosing $M = M_n \to \infty$ at a rate given by (6.6) improves the convergence rates in comparison with the finite values of $M$ (see Pensky & De Canditiis (2006), Theorem 2), these rates are considerably worse than in the case when $M = M_n$ grows at a faster rate as $n \to \infty$. Since, as we have explained in Section 1, this fast growth of $M = M_n$ with $n$ cannot be achieved in a number of practical situations, one has to settle for $M_n$ growing slowly with $n$, in particular, $M = M_n = o((\ln n)^{\alpha_3})$ for some $\alpha_3 \ge 1/2$.

The interesting question, however, is whether the convergence rates (6.7) can be improved. To answer this question, one needs either to come up with another procedure for constructing a BA $M$-tuple which belongs to a specified interval, of a non-asymptotic length, of the real line and delivers a higher value of $B(M)$ in (2.2), as $M \to \infty$, or to show that no matter what the value of $u = (u_1, u_2, \ldots, u_M)$ is, there exist integer numbers $q$ and $p_1, p_2, \ldots, p_M$ such that, as $M \to \infty$,

$$\max_{i=1,2,\ldots,M} |u_i q - p_i| \le B_1 \exp(-6M\ln M)\, q^{-1/M},$$

for some positive constant $B_1$ independent of $M$, $q$ and $p_1, p_2, \ldots, p_M$. At the moment we are unable to provide answers to either of the above questions; we challenge, however, the number theory community to work on the issue. Derivation of these results will not only enrich the theory of Diophantine approximation but will also be valuable for the theory of statistical signal processing.



8 Proofs 

Proof of Lemma 1. Observe that for $P(x)$ given by (5.2), one has

$$P((M + 1/2)Q) > 0, \quad (-1)P((M - 1/2)Q) > 0, \quad \ldots, \quad (-1)^M P(Q/2) > 0,$$

so that $P(x)$ has $M$ real roots $\xi_1, \xi_2, \ldots, \xi_M$ such that (5.3) is valid. By definition, $\xi_1, \xi_2, \ldots, \xi_M$ are algebraic integer numbers. Let us show that no proper subset of $\xi_1, \xi_2, \ldots, \xi_M$ is itself a set of conjugate algebraic integer numbers. For this purpose, note that

$$Q(|j - i| - 1/2) < |\xi_i - jQ| < Q(|j - i| + 1/2), \quad i \ne j, \quad i, j = 1, 2, \ldots, M. \tag{8.1}$$

Therefore, by (5.2), for any $i = 1, 2, \ldots, M$,

$$0 < |\xi_i - iQ| < Q^{-(M-1)} \prod_{j=1,\, j \ne i}^{M} (|j - i| - 1/2)^{-1}. \tag{8.2}$$

Now, assume that $\xi_{i_1}, \xi_{i_2}, \ldots, \xi_{i_m}$, $i_1 < \cdots < i_m$ and $m < M$, form a set of conjugate algebraic integer numbers. Then, $\tilde P(i_1 Q) = (\xi_{i_1} - i_1 Q) \cdots (\xi_{i_m} - i_1 Q)$ is an integer number and is not equal to zero, hence, $|\tilde P(i_1 Q)| \ge 1$. On the other hand, by (8.1) and (8.2),

$$1 \le |\tilde P(i_1 Q)| < Q^{-(M-1)} \prod_{j \ne i_1} (|j - i_1| - 1/2)^{-1} \prod_{k=2}^{m} [Q(|i_k - i_1| + 1/2)].$$

The product in the right-hand side above takes the largest value if $m = M - 1$, $i_1 = 1$ and $i_k = k + 1$, $k = 2, 3, \ldots, M - 1$. In this case, for $M \ge 2$, combination of the last two inequalities yields

$$1 \le |\tilde P(i_1 Q)| < 4(M - 1/2)Q^{-1} < 5M/Q,$$

which leads to a contradiction when $Q \ge 5M$. □



Proof of Lemma 2. Choose $Q = 5(M + 1)$ and construct real conjugate algebraic integer numbers $\xi_1, \xi_2, \ldots, \xi_M, \xi_{M+1}$ using the process described in Lemma 1. Then, by (8.2), $\xi_i \approx Qi$. Let $q > 0$ and $p = (p_1, p_2, \ldots, p_M)$ be integer numbers and denote

$$H_k(q, p) = \sum_{i=1}^{M} \xi_k^{M-i} p_i + \xi_k^{M} q, \quad k = 1, 2, \ldots, M + 1.$$

Note that if $p$ is not zero and the components of the vector $p/q$ are not integer numbers, then $H_k(q, p) \ne 0$. Furthermore, $H_1(q, p), H_2(q, p), \ldots, H_M(q, p), H_{M+1}(q, p)$ are themselves real conjugate algebraic integer numbers and, thus,

$$\prod_{k=1}^{M+1} |H_k(q, p)| \ge 1.$$

Now, note that (5.4) implies that $\alpha_1, \alpha_2, \ldots, \alpha_M$ are coefficients of the monic polynomial with the roots $\xi_1, \xi_2, \ldots, \xi_M$. Also, it is easy to check that for the solution $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_M)$ of the system of equations (5.4), one can write

$$H_k(q, p) = \sum_{i=1}^{M} \xi_k^{M-i} (p_i - \alpha_i q), \quad k = 1, 2, \ldots, M. \tag{8.3}$$

Moreover, if one denotes

$$\omega_M = \sum_{i=1}^{M} \alpha_i \xi_{M+1}^{M-i} + \xi_{M+1}^{M}, \tag{8.4}$$

then $H_{M+1}(q, p)$ can be written as

$$H_{M+1}(q, p) = \sum_{i=1}^{M} \xi_{M+1}^{M-i} (p_i - \alpha_i q) + q\,\omega_M. \tag{8.5}$$

Note that (8.4) implies that $\omega_M$ is the value of the polynomial

$$P(x) = (x - \xi_1) \cdots (x - \xi_M)$$

at the point $\xi_{M+1}$. Therefore, by (8.1) and (8.2), $|\omega_M| \le K M!\, Q^M$ for some constant $K > 0$.

Recall that $\|a\|$ denotes the distance from a real number $a$ to the nearest integer number. Denote $L = \max_{i=1,2,\ldots,M} |\alpha_i q - p_i|$. Note that we can assume that $p_i/q$, $i = 1, 2, \ldots, M$, are not integer numbers. Otherwise, if, for instance, $p_i/q = z$ is an integer number, then $L \ge q|\alpha_i - z| \ge q\|\alpha_i\|$ and (5.6) is valid. If $L \ge 1$, then (5.6) is valid. Hence, consider the case of $L < 1$. Then $L < |q|$ and, by (8.3), we have

$$|H_k(q, p)| \le L \sum_{i=1}^{M} \xi_k^{M-i} \le L\, \xi_k^{M}/(\xi_k - 1), \quad k = 1, 2, \ldots, M.$$

Then, using (8.5) and an upper bound for $\omega_M$, we obtain

$$|H_{M+1}(q, p)| \le |q| \left( \xi_{M+1}^{M}/(\xi_{M+1} - 1) + K M!\, Q^M \right).$$

Since $H_1(q, p), H_2(q, p), \ldots, H_M(q, p), H_{M+1}(q, p)$ are real conjugate algebraic integer numbers, one has

$$1 \le \prod_{k=1}^{M+1} |H_k(q, p)| \le \prod_{k=1}^{M} \left[ \frac{L\, \xi_k^{M}}{\xi_k - 1} \right] |q| \left[ \frac{\xi_{M+1}^{M}}{\xi_{M+1} - 1} + K M!\, Q^M \right].$$
Note that, by (8.2) and $Q \ge 5M$, one has $|\xi_{M+1} - (M + 1)Q| < Q^{-M}$ and, hence, $|\xi_{M+1}|^M \le 2Q^M (M + 1)^M$. Therefore

$$L \ge K |q|^{-1/M} \prod_{k=1}^{M} \left[ \frac{Qk - 1}{k^M Q^M} \right]^{1/M} \left[ \frac{(M + 1)^M Q^M}{Q(M + 1) - 1} + M!\, Q^M \right]^{-1/M}.$$

Plugging in $Q = 5(M + 1)$ into the expression above, we obtain

$$L \ge B(M) |q|^{-1/M}$$

with

$$B(M) = K\, [5(M + 1)]^{-(M+1)} (M!)^{-1} \prod_{k=1}^{M} [5k(M + 1) - 1]^{1/M} \left[ \frac{(M + 1)^M}{5(M + 1)^2 - 1} + M! \right]^{-1/M}.$$

Using the Stirling formula,

$$M! = \sqrt{2\pi}\, (M + 1)^{M+1/2} \exp(-(M + 1))(1 + o(1)), \quad \text{as } M \to \infty,$$

(see, e.g., formula 8.327 of Gradshtein and Ryzhik (1980)) and the fact that $\ln(M + 1) \le \ln(M) + 1/M$, after some simple algebra, we obtain that, as $M \to \infty$,

$$B(M) \ge C_0 \exp(-3M\ln M),$$

for some constant $C_0 > 0$, independent of $M$, $q$ and $p$, which proves (5.6).
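The form of the Stirling formula quoted above, with $M + 1$ in place of the more familiar $M$, can be sanity-checked numerically. The sketch below compares $\ln M!$ (via `math.lgamma`) with the logarithm of $\sqrt{2\pi}(M+1)^{M+1/2}e^{-(M+1)}$; the helper names `log_stirling` and `stirling_ratio` are hypothetical, and the check is an illustration, not part of the proof.

```python
import math


def log_stirling(M):
    """ln of sqrt(2*pi) * (M + 1)**(M + 1/2) * exp(-(M + 1))."""
    return 0.5 * math.log(2.0 * math.pi) + (M + 0.5) * math.log(M + 1) - (M + 1)


def stirling_ratio(M):
    """M! divided by the approximation; tends to 1 as M grows."""
    # lgamma(M + 1) = ln(M!), so the ratio is computed on the log scale.
    return math.exp(math.lgamma(M + 1) - log_stirling(M))


for M in (5, 50, 500):
    print(M, stirling_ratio(M))
    # The auxiliary inequality ln(M + 1) <= ln(M) + 1/M used in the proof:
    assert math.log(M + 1) <= math.log(M) + 1.0 / M
```

The ratio exceeds 1 by roughly $1/(12(M+1))$, in line with the next term of the standard asymptotic series for the Gamma function.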

Now, it remains to prove the upper bound (5.5) for $\alpha_k$, $k = 1, 2, \ldots, M$. For this purpose, recall that $\alpha_1, \alpha_2, \ldots, \alpha_M$ are coefficients of the monic polynomial with roots $\xi_1, \xi_2, \ldots, \xi_M$. Therefore, using (8.2), obtain

$$|\alpha_k| \le \binom{M}{k} MQ\,(M - 1)Q \cdots (M - k + 1)Q = k! \binom{M}{k}^2 Q^k, \quad k = 1, 2, \ldots, M.$$

Since for any $k = 1, 2, \ldots, M$, $5^k/k! \le 625/24 < 30$, $Q = 5(M + 1)$ and $(M + 1)(M - j) \le M^2$ for $j \ge 1$, one has (reading $\prod_{j=1}^{0} = 1$)

$$|\alpha_k| \le \frac{5^k}{k!} (M + 1)^k \prod_{j=0}^{k-1} (M - j)^2 < 30\, M^2\, [(M + 1)(M - k + 1)]^2 \prod_{j=1}^{k-2} [(M + 1)(M - j)^2] \le 30 M^{3k} \le 30 e^{3M\ln M}, \quad k = 1, 2, \ldots, M,$$

which proves (5.5). □



Proof of Theorem 1. It is easy to check that $\beta_1, \beta_2, \ldots, \beta_M$, as defined by (5.7), lie on $(a, b)$. Furthermore, by Lemma 2 and the fact that $z \sim 30\exp(3M\ln M)$, as $M \to \infty$, one has

$$\begin{aligned} \max_{i=1,\ldots,M} |\beta_i q - p_i| &= (z q_0)^{-1} \max_{i=1,\ldots,M} |\alpha_i (p_b - p_a) q - (z q_0 p_i - z p_a q)| \\ &\ge (z q_0)^{-1} C_0\, |(p_b - p_a) q|^{-1/M} \exp(-3M\ln M) \\ &\ge B_0 \exp(-6M\ln M)\, |q|^{-1/M}, \end{aligned}$$

for any integer numbers $q > 0$ and $p_1, p_2, \ldots, p_M$, and for some constant $B_0 > 0$, independent of $M$, $q$ and $p$. □

Proof of Lemma 4. Recall first that any real number $a$, which is not an integer number, may be uniquely determined by its continued fraction expansion

$$a = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \cdots}}},$$

where $a_0$ is an integer number and $a_1, a_2, \ldots$ are strictly positive integer numbers. The convergents $p_k/q_k = p_k(a)/q_k(a)$, $k = 0, 1, \ldots$, of $a$ are those rational numbers, the continued fraction expansions of which terminate at stage $k$, that is, $p_0/q_0 = a_0$, $p_1/q_1 = a_0 + 1/a_1$, $p_2/q_2 = a_0 + 1/(a_1 + 1/a_2)$, and so on. The denominators in the above expansions grow at least geometrically

$$q_{n+i} \ge 2^{(i-1)/2} q_n \ \text{ if } i \text{ is odd}, \qquad q_{n+i} \ge 2^{i/2} q_n \ \text{ if } i \text{ is even}, \tag{8.6}$$

and $a_n \le q_n/q_{n-1} \le a_n + 1$, $n \ge 1$. A real number $a$ is BA if $\sup_n a_n < \infty$, i.e., there exists $\bar q > 0$ such that

$$q_n/q_{n-1} \le \bar q, \quad n \ge 1 \tag{8.7}$$

(see, e.g., Schmidt (1980), Sections 3-5, pp. 7-23).
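The recalled facts about convergents are easy to verify on a small example. The sketch below builds the convergents $p_k/q_k$ from a list of partial quotients using the standard recurrence $p_k = a_k p_{k-1} + p_{k-2}$, $q_k = a_k q_{k-1} + q_{k-2}$; the function name `convergents` is hypothetical, and the partial quotients $[1; 2, 2, 2, \ldots]$ of $\sqrt{2}$ are used only as a convenient example of a number with bounded partial quotients.

```python
from fractions import Fraction


def convergents(partial_quotients):
    """Convergents p_k/q_k of the continued fraction [a0; a1, a2, ...]."""
    p_prev, q_prev = 1, 0              # p_{-1}, q_{-1}
    p, q = partial_quotients[0], 1     # p_0, q_0
    out = [Fraction(p, q)]
    for a in partial_quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        out.append(Fraction(p, q))
    return out


cs = convergents([1, 2, 2, 2, 2, 2])   # truncated expansion of sqrt(2)
qs = [c.denominator for c in cs]
# Geometric growth over two steps, and a_n <= q_n / q_{n-1} <= a_n + 1 with a_n = 2.
growth_ok = all(qs[n + 2] >= 2 * qs[n] for n in range(len(qs) - 2))
ratio_ok = all(2 <= Fraction(qs[n], qs[n - 1]) <= 3 for n in range(1, len(qs)))
```

For this example the ratios $q_n/q_{n-1}$ stay in $[2, 3]$, so any $\bar q \ge 3$ works in (8.7).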

Let $p/q$ and $p'/q'$ be successive principal convergents in the continued fraction expansion of $\beta_1$. Let $N$ be a positive integer number with $N + q < q'$. Then, application of Lemma 3 with $h(x) = x^{-1}$ yields

$$\sum_{l=N+1}^{N+q} \|l\beta_1\|^{-1} = O(q\ln q), \tag{8.8}$$

since $q' \le \bar q q$ by (8.7). Now, note that by (5.8)

$$\sum_{l=N+1}^{N+q} \left( \|l\beta_1\|^2 + \cdots + \|l\beta_M\|^2 \right)^{-k} \le \sum_{l=N+1}^{N+q} \|l\beta_1\|^{-1} \left[ \max(\|l\beta_1\|, \ldots, \|l\beta_M\|) \right]^{-(2k-1)}. \tag{8.9}$$

Combination of (5.8), (8.8) and (8.9) implies that

$$\sum_{l=N+1}^{N+q} \left( \|l\beta_1\|^2 + \cdots + \|l\beta_M\|^2 \right)^{-k} = O\left( e^{6(2k-1)M\ln M}\, q^{1+(2k-1)/M} \ln q \right), \quad k = 1, 2, 3, 4. \tag{8.10}$$

Now, observe that the set of indices $l$ in $\Omega_j$ is symmetric about zero, and so are the components of the sum. Hence, we can consider only the positive part of $\Omega_j$ which, with some abuse of notation, we keep calling $\Omega_j$. Let $q_i$ be the denominators of the convergents of $\beta_1$ and let $l$ be the smallest number such that $q_l \ge 2^j$. The geometric growth of the denominators (8.6) implies that $2^{j+r_0} \le 2^{r_0} q_l \le q_{l+2r_0}$ so that $\Omega_j \subset [q_{l-1}, q_{l+2r_0})$. If we denote $D_s = \mathbb{N} \cap [q_{l+s-1}, q_{l+s})$, $s = 0, 1, \ldots, 2r_0$, then

$$\Omega_j \subset \bigcup_{s=0}^{2r_0} D_s.$$

Since, by (8.7), $q_{l+1} \le \bar q q_l$, there are at most $\bar q$ disjoint blocks of length $q_{l+s-1}$ that cover $D_s$. Applying (8.10) to each of those blocks, we derive

$$\sum_{l \in D_s} \left( \sum_{i=1}^{M} \|l\beta_i\|^2 \right)^{-k} = O\left( e^{6(2k-1)M\ln M}\, q_{l+s-1}^{1+(2k-1)/M} \ln q_{l+s-1} \right), \quad k = 1, 2, 3, 4.$$

Note that $q_{l-1} < 2^j$, so that $q_{l+s-1} \le \bar q^{\,s} q_{l-1} < \bar q^{\,s} 2^j$. Therefore,

$$\sum_{s=0}^{2r_0} \sum_{l \in D_s} \left( \sum_{i=1}^{M} \|l\beta_i\|^2 \right)^{-k} = O\left( j\, e^{6(2k-1)M\ln M}\, 2^{j(1+(2k-1)/M)} \right), \quad k = 1, 2, 3, 4,$$

proving, thus, (6.3). □



Proof of Lemma 5. In what follows, we shall only construct the proof for the term involving $b_{jk}$ since the proof for the term involving $a_{j_0 k}$ is very similar. Denote

$$A_k(j) = \frac{1}{|C_j|} \sum_{m \in C_j} [\tau_1(m)]^{-2k} \left[ \frac{1}{M} \sum_{l=1}^{M} |g_m(u_l)|^{2k} \right], \quad k = 1, 2,$$

where $\tau_1(m)$ is given by (4.2) and (4.6). Note that, by (3.2) and (3.5), we have

$$\hat f_m - f_m = N^{-1/2} [\tau_1(m)]^{-1}\, \frac{1}{M} \sum_{l=1}^{M} \overline{g_m(u_l)}\, z_{ml},$$

where $z_{ml}$ are standard Gaussian random variables, independent for different $m$ and $l$. Therefore, since in the case of Meyer wavelets, $|\psi_{mjk}| \le 2^{-j/2}$ and $|C_j| \asymp 2^j$ (see, e.g., Johnstone et al. (2004), p. 565), we derive that $E|\hat b_{jk} - b_{jk}|^2$ is given by expression (4.1). If $k = 2$, then

Efefc - 6 jfc | 4 = O E \fm - /ml' I + Ol Yl E \fm ~ fm\ 

\m£Cj J \ m&Cj 

=0 (N-^M-^nim)}- 4 ]T J2\9mMA +0{N- 2 M- 2 2-^[r 1 (m)]' 2 ) 



=0 (2^N- 2 M- s A 2 (j) + N- 2 M~ 2 Af(j)) = O (n _2 [M _1 2^ A 2 (j) + A((j)}) • ( 8 - n ) 

Now, recall that \g m (ui)\ - |m| -1 ||mA|| by P~5|) . Note that, by formula (foTT|) . tt k (j,M) 
is increasing in $r_0$ and recall that, by the definition of the Meyer wavelet basis, one has $|m| \in [(2\pi/3)2^j, (8\pi/3)2^j] \subset \Omega_j$ with $r_0 = 3 + \log_2(\pi/3)$ (see (3.3) and (6.2)). Then, direct calculations yield

$$A_1(j) = O\left(2^j M\, \aleph_1(j, M)\right) \quad \text{and} \quad A_2(j) = O\left(2^{3j} M^4\, \aleph_4(j, M)\right). \tag{8.12}$$

To complete the proof, combine (6.3), (8.11) and (8.12) and note that $M j^{-1} 2^{-j(1 - 5/M)} = o(1)$ as $n \to \infty$, since $2^j \ge 2^{j_0} = \ln n$ and $M = M_n \to \infty$ as $n \to \infty$. □






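Proof of Lemma 6. It is easy to see that $\hat b_{jk} - b_{jk}$ follows a Gaussian distribution with mean zero and variance bounded by $C_\psi (n_M)^{-1} j\, 2^{j(2+1/M)}$. Hence,

$$P\left( |\hat b_{jk} - b_{jk}|^2 > \eta^2 (n_M)^{-1} j\, 2^{j(2+1/M)} \ln n \right) \le 2\Phi\left( -\eta \sqrt{\ln n / C_\psi} \right) = O\left( \frac{n^{-\theta}}{\sqrt{\ln n}} \right),$$

where $\Phi(\cdot)$ is the cumulative distribution function of a Gaussian random variable with mean zero and variance one. □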

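Proof of Theorem 2. Due to the orthogonality of the Meyer wavelet basis, we obtain

$$E\|\hat f_n - f\|^2 = R_0 + R_1 + R_2 + R_3 + R_4 + R_5,$$

where

$$\begin{aligned} R_0 &= \sum_{k=0}^{2^{j_0}-1} E(\hat a_{j_0 k} - a_{j_0 k})^2, \qquad R_1 = \sum_{j=J}^{\infty} \sum_{k=0}^{2^j-1} b_{jk}^2, \\ R_2 &= \sum_{j=j_0}^{J-1} \sum_{k=0}^{2^j-1} E\left[ (\hat b_{jk} - b_{jk})^2\, I(|\hat b_{jk}| \ge \lambda_j) \right] I(|b_{jk}| < \lambda_j/2), \\ R_3 &= \sum_{j=j_0}^{J-1} \sum_{k=0}^{2^j-1} b_{jk}^2\, P(|\hat b_{jk}| < \lambda_j)\, I(|b_{jk}| \ge 2\lambda_j), \\ R_4 &= \sum_{j=j_0}^{J-1} \sum_{k=0}^{2^j-1} E\left[ (\hat b_{jk} - b_{jk})^2\, I(|\hat b_{jk}| \ge \lambda_j) \right] I(|b_{jk}| \ge \lambda_j/2), \\ R_5 &= \sum_{j=j_0}^{J-1} \sum_{k=0}^{2^j-1} b_{jk}^2\, P(|\hat b_{jk}| < \lambda_j)\, I(|b_{jk}| < 2\lambda_j). \end{aligned}$$

Denote

$$\zeta(s, M) = \frac{2(3 + 1/M)}{2s + 3 + 1/M}$$

and observe that $\zeta(s, M) < 2$ for $s > 1/\min(p, 2)$. First, consider the terms $R_0$ and $R_1$. Using Lemma 5, it is easily seen that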

$$R_0 = O\left( n^{-1} 2^{j_0}\, \aleph_1(j_0, M) \right) = O\left( n^{-1} j_0\, 2^{j_0(2+1/M)} e^{6M\ln M} \right) = O\left( (M n_M)^{-1} \ln^3 n \right) = o\left( (n_M)^{-\frac{2s}{2s+3+1/M}} \right).$$

Furthermore, it is well-known (see, e.g., Johnstone (2002), Lemma 19.1) that if $f \in B^s_{p,q}(A)$, then for some positive constant $c^*$, dependent on $p$, $q$, $s$ and $A$ only, we have $\sum_{k=0}^{2^j-1} b_{jk}^2 \le c^* 2^{-2js^*}$ and, thus,

$$R_1 = O\left( 2^{-2Js^*} \right) = O\left( (n_M)^{-2s^*/(3+1/M)} \right).$$

By direct calculations, one can check that if $2 \le p \le \infty$, then $s^* = s$ and hence

$$R_1 = O\left( (n_M)^{-2s/(2s+3+1/M)} \right).$$

On the other hand, if $1 \le p < 2$ then $s^* = s + 1/2 - 1/p$. If $\zeta(s, M) < p < 2$ then $2s^*/(3 + 1/M) > 2s/(2s + 3 + 1/M)$ and, hence,

$$R_1 = O\left( (n_M)^{-2s/(2s+3+1/M)} \right).$$

Similarly, if $1 \le p \le \zeta(s, M)$, then $2s^*/(3 + 1/M) > 2s^*/(2s^* + 2 + 1/M)$ and therefore

$$R_1 = O\left( (n_M)^{-2s^*/(2s^*+2+1/M)} \right).$$

Now, consider the term $R_2$. Using Lemma 5 and Lemma 6 with $\theta > 2$, formula (6.4), and the fact that $e^{M\ln M} = o(n^{\alpha})$ for any $\alpha > 0$ as $n \to \infty$, after some simple algebra, one derives



J-l 2J-1 J-l 2^-1 



^2 < E E E [ft* " 6 i*) 2 I(fefc - ^ V 2 )] < 5^ ^[ftfc - b jk )^F(\b jk - 6 jfc | 2 > 



i=io fc=o j=jo k=o 

„ „ „15MlnM\ 



£1 ^ Me 21 M In M -2i(2+l/M) \ / 2 J(3+l/M) ^ n g 15 M In M \ 

( E E ^ )=°[ —e = o ((n M y 



yj=j k=0 

For the term $R_3$, again applying Lemma 6 with $\theta > 2$, obtain

J-ltf-l / j-i \ 

i=io fc=o y=jo / 

Now, consider the term $R_4$. Let $j_1$ be such that

$$2^{j_1} = O\left( (n_M)^{1/(2s+3+1/M)} (\ln n)^{\xi_0} \right),$$

for some real number $\xi_0$. First, consider the case when $p > \zeta(s, M)$. Then,

$$R_4 \le \sum_{j=j_0}^{J-1} \sum_{k=0}^{2^j-1} E(\hat b_{jk} - b_{jk})^2\, I(|b_{jk}| \ge \lambda_j/2) = R_{41} + R_{42},$$

where

$$R_{41} = \sum_{j=j_0}^{j_1} \sum_{k=0}^{2^j-1} E(\hat b_{jk} - b_{jk})^2\, I(|b_{jk}| \ge \lambda_j/2), \qquad R_{42} = \sum_{j=j_1+1}^{J-1} \sum_{k=0}^{2^j-1} E(\hat b_{jk} - b_{jk})^2\, I(|b_{jk}| \ge \lambda_j/2).$$

Then, Lemma 5 yields

$$R_{41} = O\left( \sum_{j=j_0}^{j_1} \sum_{k=0}^{2^j-1} (n_M)^{-1} j\, 2^{j(2+1/M)} \right) = O\left( (n_M)^{-2s/(2s+3+1/M)} (\ln n)^{1+\xi_0(3+1/M)} \right).$$

For the term $R_{42}$, one derives

J-l 



^2 = o E E^) -1 ^ 2 " 



1/M) 



|A 7 '| p 

= O [(nAfJ-^^Onn) 1 -^ E 2^( 2+1 /^)(i-f/2)- S >] j = (( nM )-w(l nn )p 2 ) , 

V j=ji+i / 

where pi = -2s /{2s + 3 + 1/M) and p 2 = 1 - P - £0 [ps* - (2 + l/M)(l -jj/2)]. Now, choosing w 
£0 = — 2/(2s + 3 + 1/M), and combining the above terms, one easily arrives at 



_R 4 = o [{n m) 2s+3 + 1/m (In ra) 2H-3+1/M ^ 
Now, consider the case when $1 \le p \le \zeta(s, M)$. Note that, same as above,

i?4i = O ((n M )- 2s/(2s+3+1/M) (lnn) 1+ «»( 3+1 / M )) ; 

but £o does not need to be the value chosen above. Observe that, since for R4 one has \bjk\ < 
c*2~i s and \bjk\ > Aj/2, then, combination of these inequalities requires j < j% where j 2 satisfies 
j 2 2 j2 = O (n M /ln n) 2s *+ 2 + 1 /M . Then, |6 jfc | < Aj /2 if j > j 2 + 1 and 

i? 42 = O ((n M ) _{1_p/2) (lnn) 1 ^ 2 J ' 2 [(2+i/Af)(i- P /2)-«* P ]^ = Q (( nM )P3( lnn y>4) ; 

where p 3 = -2s*/(2s* + 2 + 1/M) and p 4 = 2a*/(2«* + 2 + 1/M) -p/2- [(2 + 1/M)(1 -p/2) -ps*]. 
Noting that, in this case, s/(2s+3+l/M)-s7(2s* + 2+l/M) > and one arrives at i? 4i = o(i? 42 ) 
as n — > 00. Therefore, 



^ / /lnn\ ™p ,_£_r r2+ xvi_Ev» s *l 1 „ / /lnn\ 2.-+2+1/M \ 



since the power of In n in the expression above is negative. 

Finally, consider the term R§. First, consider the case when ((s,M) < p < 2. Let j$ be 
such that 

2^3 = O ((n M ) s * (2s+ ^ +l/M) (Inn) 51 ) , 
for some real number £1. Then, 

J-l 2J-1 

i?5 < E E b h < 2X ^ ^ R51 + ^ 52 ' 

j=jo fc=0 

where 

Ji 2J-1 2 is 2*-l 

^51= E E 6 ^ = (( n ^)"™ 7I? ( lnn )^ 1 )' ^2 = E E^ I (i^i< 2 ^)- 

i=i3+i fc=o j=io fc=o 

Let 

2-?-l 

= E ^1^1 < 2A ^- 



fc=0 



Note that 



and also 



E(j) = O (2>A?) = O (j 2^ 3 +VM) lnn (nM) -i) 



'2^-1 



so-) = o (^e im p \ b ^\ 2 ~ p < 2A ^)j = ° ( A r p 2_j>s 

= O ([timY' 2 - 1 (lnn) 1 ^/ 2 j 1 "'/ 2 #'K 2+ £)(i-§)-p**]) . 

Let j'4 be such that 

2 jA = O ((nA f ) 2s +3+i/M(i nn )6) j 
for some real number $\xi_2$. Then

J=JO 3=34+1 \j=jo 

+ 0[ jr {n M y/ 2 -\\nn) l -vl 2 j l - p/2 ZM 2 +Mi-V-P°*l 

\J'=J4+1 , 

= O ((n M ) _2s+; "+ 1/M (lnn) 2+ « 2 ( 3+1 / M )) + O ((n M )^ s+3 + 1/M (lnn) 2 - p+ ^ 2+ i? X 1 " 
Since the bound for .R52 is valid for any value of £2, we choose £2 which minimizes 

max(2 + 6(3 + 1/M), 2 - p + &[(2 + 1/M)(1 - p/2) - ps*}), 
i.e., 6 = [s* + 1 + 1/p + 1/(2M)]- 1 (2s* + 2/p - 1). Hence 

( 2s 2s* + 2/p-l 

(n M ) 2s + 3 +!/m (i n n ) s *+i+i/ P +i/(2M) 

Choose now £1 = — 2/(2s + 3 + 1/M). Then, combining the .R51 and R52 terms, obtain 

(/ l n n \ 2s+3+l/M 2a \ 

( — J (lnn) 23 + 3 +VM I . 

Now, consider the case when 1 < p < ((s, M). Let j§ be such that 

2 j5 = O ^(lnn/n M ) 23 ' + ^ +1/M (hin) ?3 ) , 
for some real number £3. Then 

R5 < -^51 + -^52 + ^53) 

where 

J-l 2^-1 J4 JB 

^1= E E 6 ^ ^52 = E H W' ^3= E s t?')- 

3=35+1 k=0 j=j j=J4+l 

It is immediate that 

7-1 \ I , 2s* 

In n \ 2s*+2+i/m 



o'=i5+i 



and that 



V?=.?0 

After some simple algebra, one obtains 

is 



R52 = o e - — — — = ( ( n ^) 2s * +2+i/M ) 



i? 53 = o[ E (^m)^ 2 - 1 (Inn) 1 "*/ 2 2^[(2+^)(i-|)-^] 

- Ol ihjl)^™ (lnn) l-p/2 + 6[(2 + i)(l-§)-^ 

n M J 
Choosing $\xi_3 = -1/(2s^* + 2 + 1/M)$, and combining the above terms, we arrive at



In 77 \ 2s*+2+l/M 2a* 
/?, O [ ( — - J ( lnn ) 2s * + 2+l/M 



Finally, consider the case, 2 < p < 00. In this case, j'3 = j'4, and we easily see that 



Rh = 



lnn\ 2s+3+i /m 



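Combining all the above expressions, we obtain that, as $n \to \infty$,

$$R_n(\hat f_n) = \begin{cases} O\left( \left( \frac{\ln n}{n_M} \right)^{\frac{2s}{2s+3+1/M}} \right), & \text{if } 2 \le p \le \infty,\\[6pt] O\left( \left( \frac{\ln n}{n_M} \right)^{\frac{2s}{2s+3+1/M}} (\ln n)^{\frac{2s}{2s+3+1/M}} \right), & \text{if } \zeta(s, M) < p < 2,\\[6pt] O\left( \left( \frac{\ln n}{n_M} \right)^{\frac{2s^*}{2s^*+2+1/M}} (\ln n)^{\frac{2s^*}{2s^*+2+1/M}} \right), & \text{if } 1 \le p \le \zeta(s, M). \end{cases} \tag{8.13}$$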

Now, note that $6/(2s + 3) < \zeta(s, M)$ for any $M > 0$. Hence, if $p \le 6/(2s + 3)$, then $p < \zeta(s, M)$. On the other hand, if $p > 6/(2s + 3)$, then it is easy to show that for $M$ large enough one has $p > \zeta(s, M)$. Observe also that $p > 6/(2s + 3)$ if and only if $s > 3(1/p - 1/2)$.

The upper bound in (8.13) depends on the choice of $M = M_n$. Choose $M_n$ of the form (6.6). Then, from the definition of $n_M$ and formulae (6.9) and (8.13), it follows that

$$R_n(\hat f_n) = O\left( \exp\left\{ -(\Delta_2 + 1/M)^{-1} \left[ \Delta_1 (\ln n - 6M\ln M - \ln M) - \Delta_3 \ln\ln n \right] \right\} \right). \tag{8.14}$$



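Using Taylor expansion, we write $(\Delta_2 + 1/M)^{-1} = \Delta_2^{-1} - M^{-1}\Delta_2^{-2} + M^{-2}\Delta_2^{-3} + O(M^{-3})$ as $M \to \infty$. Recalling that $\ln M = \ln\nu + 0.5\ln\ln n - 0.5\ln\ln\ln n$ and plugging expressions for $M$, $\ln M$ and $(\Delta_2 + 1/M)^{-1}$ into the argument of the exponent in (8.14), by direct calculations, one derives that

$$R_n(\hat f_n) = O\left( \exp\left( -(\Delta_2)^{-1}\Delta_1 \ln n + A_n \right) \right),$$

where, as $n \to \infty$,

$$A_n = \sqrt{\ln n}\, \sqrt{\ln\ln n} \left[ \frac{\Delta_1}{\Delta_2}\left( 3\nu + \frac{1}{\Delta_2\nu} \right) + \frac{3\Delta_1\nu}{\Delta_2}\, \frac{\ln\ln\ln n}{\ln\ln n} \left( \frac{2\ln\nu}{\ln\ln\ln n} - 1 \right) + \frac{\ln\ln\ln n}{\sqrt{\ln n}\, \sqrt{\ln\ln n}} \left( \frac{3\Delta_1}{\Delta_2^2} - \frac{\Delta_1}{2\Delta_2} \right) + \frac{\sqrt{\ln\ln n}}{\sqrt{\ln n}} \left( \frac{\Delta_3}{\Delta_2} + \frac{\Delta_1}{2\Delta_2} - \frac{3\Delta_1}{\Delta_2^2} - \frac{\Delta_1}{\nu^2\Delta_2^3} \right) \right].$$

Now, to complete the proof, note that the main term in $A_n$ is minimized by $\nu = \nu_{opt} = (3\Delta_2)^{-1/2}$ and that $\Delta_2 > 2$ for any $s > 1/\min(p, 2)$. □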
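The minimization in the last step can be written out explicitly. The following short derivation is a sketch under the assumption that the main term of $A_n$ is $\sqrt{\ln n}\,\sqrt{\ln\ln n}\,(\Delta_1/\Delta_2)\,g(\nu)$ with $g(\nu) = 3\nu + 1/(\Delta_2\nu)$; the auxiliary function $g$ is introduced here only for illustration.

```latex
\begin{align*}
g(\nu) &= 3\nu + \frac{1}{\Delta_2 \nu}, \qquad \nu > 0, \\
g'(\nu) &= 3 - \frac{1}{\Delta_2 \nu^2} = 0
  \quad\Longleftrightarrow\quad \nu^2 = \frac{1}{3\Delta_2}
  \quad\Longleftrightarrow\quad \nu_{opt} = (3\Delta_2)^{-1/2}, \\
g''(\nu) &= \frac{2}{\Delta_2 \nu^3} > 0
  \quad\text{(so the stationary point is a minimum)}, \\
g(\nu_{opt}) &= \frac{3}{\sqrt{3\Delta_2}} + \frac{\sqrt{3\Delta_2}}{\Delta_2}
  = 2\sqrt{\frac{3}{\Delta_2}}, \\
\frac{\Delta_1}{\Delta_2}\, g(\nu_{opt})\, \sqrt{\ln n}\,\sqrt{\ln\ln n}
  &= 2\sqrt{3}\, \Delta_1 \Delta_2^{-3/2}\, \sqrt{\ln n}\,\sqrt{\ln\ln n}.
\end{align*}
```

For instance, with $\Delta_1 = 2s$ and $\Delta_2 = 2s + 3$ at $s = 2$, this gives $\nu_{opt} = 21^{-1/2} \approx 0.218$ and a leading coefficient of about $0.748$.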